Method for identification of suitable fragmentation sites in a
reporter protein

The present invention is related to the field of methods for de- 
tecting the interaction of proteins via the use of fusion pro- 
teins, commonly referred to as split-protein sensors or two- 
hybrid assays. 

The introduction of the yeast two-hybrid system by Fields and
Song in 1989 was a milestone for the analysis of protein-protein
interactions in living cells (cf. US 5,667,973 and Fields, S.,
and Song, O. (1989), Nature 340, 245-246). However, a major
limitation of this classical two-hybrid system lies in its
restriction to the detection of those protein-protein interactions
that can be reproduced within the nucleus of a yeast cell. To
overcome this restriction, an alternative to this two-hybrid
method was introduced in 1994 by Johnsson and Varshavsky (cf. WO
95/29195 and Johnsson, N., and Varshavsky, A. (1994), Proc Natl
Acad Sci U S A 91, 10340-10344). Here, the two interacting
proteins are expressed as fusion proteins with an N- and a
C-terminal fragment of ubiquitin. Upon interaction of the two
proteins a quasi-native ubiquitin is formed and subsequently
recognized by ubiquitin-specific proteases, resulting in the
cleavage of a reporter protein from the C-terminal fragment of
ubiquitin. The split-ubiquitin system allows for the detection of
interactions between cytoplasmic as well as membrane proteins.
Since the introduction of split-ubiquitin, a variety of other
split-protein sensors have been developed, including pairs of
fragments of dihydrofolate reductase (DHFR), β-galactosidase,
β-lactamase, inteins, green fluorescent protein (GFP), cAMP
cyclase, glycinamide ribonucleotide transformylase,
aminoglycoside phosphotransferase, hygromycin B
phosphotransferase, and luciferase (cf. Remy, I., and Michnick,
S.W. (1999), Proc Natl Acad Sci U S A 96, 5394-5399; Rossi, F.,
Charlton, C.A., and Blau, H.M. (1997), Proc Natl Acad Sci U S A
94, 8405-8410; Galarneau, A., Primeau, M., Trudeau, L.E., and
Michnick, S.W. (2002), Nat Biotechnol 20, 619-622; Wehrman, T.,
Kleaveland, B., Her, J.H., Balint, R.F., and Blau, H.M. (2002),
Proc Natl Acad Sci U S A 99, 3469-3474; Ozawa, T., Nogami, S.,
Sato, M., Ohya, Y., and Umezawa, Y. (2000), Anal Chem 72,
5151-5157; Ozawa, T., Kaihara, A., Sato, M., Tachihara, K., and
Umezawa, Y. (2001), Anal Chem 73, 2516-2521; Ghosh, I., Hamilton,
A.D., and Regan, L. (2000), Journal of the American Chemical
Society 122, 5658-5659). Among these systems only split-ubiquitin
was successfully applied to screen for binding partners. Other
sensors were used to monitor the interactions between selected
pairs of proteins rather than to find new partners by a random
library approach. Robust systems that can be used for identifying
interaction partners at any location inside the cell and in
different hosts are therefore still needed. Ideally, the
interaction-induced reassociation of such a split-protein sensor
would provide the cell with a growth advantage, thus allowing a
selection for interacting proteins.

However, generating new split-protein sensors is technically
demanding as it depends critically on identifying suitable
fragments that can reconstitute a native-like and active protein.
The chosen fragmentation site has to fulfill at least the
following criteria: (i) to yield two fragments that efficiently
fold into quasi-native protein only when fused to two interacting
proteins; (ii) not to significantly impair the activity of the
reconstituted protein; (iii) to yield soluble protein fragments
that are not readily degraded in vivo. In previous studies, the
challenge of rationally finding such sites has mostly been
tackled by trial and error.

It is thus an object of the present invention to overcome the
above-mentioned drawbacks of the prior art, i.e. to provide a
method for identification of suitable fragmentation sites in a
reporter protein, especially for use as a split-protein sensor,
that is not limited by the above-mentioned drawbacks of rational
design, and which especially allows for the identification of
suitable fragmentation sites in a reporter protein even in the
absence of any structural information such as a crystal
structure. Further objects of the invention will become apparent
to the person of routine skill in the art in view of the
following detailed description of the invention.

This object and yet further objects are achieved inter alia by a
method for the identification of suitable fragmentation sites in
a reporter protein, and related thereto, recombinant DNA
sequences and, encoded thereby, first and complementary second
subdomains of a reporter protein, host cell lines transformed
with said recombinant DNA sequences, a kit of parts comprising
DNA-based expression vectors, a method for detecting an
interaction between proteins, a use of random circular
permutation and a use of a host cell line allowing for homologous
recombination according to the independent claims.

Most biological processes are controlled by protein-protein
interactions, and split-protein sensors have become one of the
few available tools for the characterization and identification
of protein-protein interactions in living cells. Here we
introduce a generally applicable combinatorial approach for the
generation of new split-protein sensors and apply it to the
(β/α)8-barrel enzyme N-(5'-phosphoribosyl)-anthranilate isomerase
Trp1p from Saccharomyces cerevisiae (cf. Braus, G.H., Luger, K.,
Paravicini, G., Schmidheini, T., Kirschner, K., and Hutter, R.
(1988), J Biol Chem 263, 7868-7875). These so-called split-Trp
protein sensors are capable of monitoring the interactions of
pairs of cytosolic and membrane proteins. One of the selected
split-Trp pairs (^^Ntrp and ^^Ctrp) was chosen by way of example
and successfully applied to monitor protein-protein interactions
both at the membrane and in the cytosol of yeast. Its selected
fragmentation site would not have been easily predicted by
theoretical considerations, thus underlining the power of the
evolutionary approach according to the invention. The direct
read-out through complementation of tryptophan auxotrophy
qualifies the split-Trp system for high-throughput applications
in yeast and bacteria. Of course, appropriately engineered
trp1-deficient host strains are required for such assays; these
are, however, either readily available or easily made by the
person of routine skill in the art. In addition, the introduced
combinatorial approach allows for generating split-protein
sensors of almost any reporter protein, thereby yielding
tailor-made sensors for different applications.

Trp1p is a relatively small (25 kD), monomeric protein that
catalyzes the isomerization of N-(5'-phosphoribosyl)-anthranilate
in the biosynthesis of tryptophan (cf. Eberhard, M.,
Tsai-Pflugfelder, M., Bolewska, K., Hommel, U., and Kirschner, K.
(1995), Biochemistry 34, 5419-5428). The coding DNA sequence from
Saccharomyces cerevisiae is given in SEQ ID NO: 1, the
corresponding amino acid sequence is given in SEQ ID NO: 2.
Creating a pair of Trp1p fragments (split-Trp) that only
reconstitute the enzymatic activity when linked to interacting
proteins allows monitoring this protein interaction through a
simple growth assay: without the interaction, trp1 yeast strains
expressing such a split-Trp fusion pair would not be able to grow
on medium lacking tryptophan. As many different trp1 strains
exist, the interaction assay could be applied immediately in
different genetic backgrounds, adding a further attractive
feature to a split-Trp sensor. Trp1p is a well-studied member of
the prominent class of proteins that fold into a (β/α)8-barrel,
which is the most commonly occurring fold among enzymes. The
herein presented approach of identifying suitable fragmentation
sites in a reporter protein is thus very broadly applicable. This
folding motif has previously been subjected to circular
permutation and has been expressed as two separate fragments that
spontaneously associate into a functional enzyme (cf. Luger, K.,
Hommel, U., Herold, M., Hofsteenge, J., and Kirschner, K. (1989),
Science 243, 206-210; Eder, J., and Kirschner, K. (1992),
Biochemistry 31, 3617-3625). Furthermore, it has been proposed
that the (β/α)8-barrel evolved by tandem duplication from a
(β/α)4-domain (cf. Höcker, B., Schmidt, S., and Sterner, R.
(2002), FEBS Lett 510, 133-135). In addition to any practical
applications it would therefore add to our understanding to learn
where the (β/α)8-barrel can be split into two fragments that, in
contrast to previously described pairs of fragments, reconstitute
quasi-native Trp1p only when fused to interacting proteins.

As used herein, a "reporter protein" is understood as a protein
or peptide which possesses a unique activity in vivo and/or in
vitro, and which produces a signal that allows the active protein
to be easily discernible even within a complex mixture of other
proteins or peptides, especially in vivo. Reporter proteins as
understood herein are e.g. (i) proteins which are essentially
involved in the biosynthetic pathway of an amino acid or another
essential metabolite and are therefore crucial for the organism
to survive on medium lacking the respective amino acid or
metabolite; or (ii) proteins which are detectable by a
characteristic color assay when active, preferably in vivo; etc.

As used herein, a "suitable fragmentation site" is understood as
an especially randomly chosen position in the amino acid chain
(and/or the corresponding gene sequence, respectively), at which
a given reporter protein is fragmented into a first subdomain
and a complementary second subdomain (and/or the corresponding
first subsequence and the complementary second subsequence,
respectively), wherein the fragmentation site is suitable in the
sense of the present invention when it fulfils the following
demands: (i) to yield two fragments that efficiently fold into
quasi-native protein only when fused to two interacting proteins;
(ii) not to significantly impair the activity of a protein
reconstituted by bringing the two fragments into close proximity,
especially in vivo; (iii) to yield soluble protein fragments that
are not readily degraded in vivo.

As used herein, the term "detectable", especially "detectable
when active", is understood as follows. Detection in the sense of
the present invention includes any direct or indirect method of
testing for the presence of a reporter protein, especially when
reconstituted by fragments thereof, e.g. by chemical, physical,
or visual means. Most preferably, detection is performed by a
color assay, e.g. fluorescence, chemiluminescence or the like
(in vivo and/or in vitro), and/or a growth assay (in vivo).

As used herein, a "first subdomain" and a "complementary second
subdomain" of a reporter protein are understood as follows. A
first subdomain represents a first contiguous part (either an
N-terminal, a C-terminal or an internal part, or even a part
involving both the N-terminal and the C-terminal part) of a
native reporter protein. A complementary second subdomain
represents a complementary second part (either an N-terminal, a
C-terminal or an internal part, or even a part involving both the
N-terminal and the C-terminal part). The first subdomain and the
complementary second subdomain essentially resemble the wild-type
sequence when viewed together, wherein overlapping sequences that
are present in both the first subdomain and the complementary
second subdomain can be tolerated as long as the activity of the
enzyme is not significantly negatively affected. Moreover, minor
deletions, additions or other alterations to the overall sequence
can be tolerated, especially at the N-terminus or the C-terminus,
as long as the activity of the reporter protein, either as a
whole or when reconstituted by its fragments, is not
significantly negatively affected.

As used herein, a "first subsequence" and a "complementary second
subsequence" are understood as gene sequences encoding for the
above-mentioned first subdomain and complementary second
subdomain.

As used herein, a "color assay" is understood as a manual or
device-supported detection of a change in optical appearance of
a sample comprising the reporter protein, or a reporter protein
reconstituted by its fragments, including color developments both
in the visible and in the invisible spectrum. Especially
preferred are color assays that can be qualitatively detected by
the unaided eye, e.g. by coloration of living cells in vivo
(colonies on a plate or the like), and that can additionally be
quantified in an in vitro assay, e.g. for determining the
intensity of an interaction between two proteins.

As used herein, a "growth assay" is understood as an assay that
allows for the growth of a cell, e.g. a colony on a plate, when
the reporter protein is present or actively reassembled from its
fragments, and wherein cells fail to grow when the reporter
protein is neither present nor actively reassembled from its
fragments. Most preferably, the growth assay thus allows for a
simple visual selection of positives.

As used herein, "stringent conditions" for hybridization of DNA
are understood as follows. Given a specific DNA sequence, a
person of skill in the art would not expect substantial variation
among species within the claimed genus due to hybridization under
such conditions, thus expecting structurally similar DNA.

The method according to the invention for the identification of
suitable fragmentation sites in a reporter protein, wherein the
reporter protein is detectable when active, comprises the steps
of:

(a) providing a DNA sequence encoding for said reporter protein;

(b) creating a library based on the DNA sequence as defined in
    (a),

    wherein each individual of said library comprises a
    randomly created first subsequence of the DNA sequence
    as defined in (a), encoding for a first subdomain of
    said reporter protein, and

    wherein each individual of said library comprises a
    randomly created complementary second subsequence of
    the DNA sequence as defined in (a), encoding for a
    complementary second subdomain of said reporter protein;

(c) screening and/or selection for restoration of detectable
    activity of said reporter protein, when said first subdomain
    and said complementary second subdomain are brought into
    close proximity;

(d) identifying said first subdomain and/or said first
    subsequence, and said complementary second subdomain and/or
    said complementary second subsequence, that lead to
    restoration of detectable activity of said reporter protein.
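
Steps (a) to (d) can be pictured with a minimal in silico sketch.
It is an illustration under stated assumptions, not the wet-lab
procedure: the toy gene, the restriction of split sites to codon
boundaries, and the names make_library, select_positives and
restores_activity are all hypothetical, and the predicate passed
as restores_activity merely stands in for the growth or color
assay of step (c).

```python
import random
from typing import Callable, List, Tuple

Pair = Tuple[str, str]  # (first subsequence, complementary second subsequence)


def make_library(dna: str, size: int, rng: random.Random) -> List[Pair]:
    """Step (b): split the coding sequence at randomly chosen codon boundaries."""
    codons = len(dna) // 3
    if codons < 2:
        raise ValueError("need at least two codons to split")
    library = []
    for _ in range(size):
        # Simplifying assumption: the site lies on a codon boundary
        # and never at the very ends of the gene.
        site = 3 * rng.randint(1, codons - 1)
        library.append((dna[:site], dna[site:]))
    return library


def select_positives(
    library: List[Pair], restores_activity: Callable[[str, str], bool]
) -> List[Pair]:
    """Steps (c) and (d): keep the pairs for which the assay reports activity."""
    return [pair for pair in library if restores_activity(*pair)]


toy_gene = "ATGGCTAAAGGTTTTCCCGAA"  # hypothetical 21-nt sequence, step (a)
library = make_library(toy_gene, size=10, rng=random.Random(0))
# Hypothetical stand-in for the assay: accept pairs whose first part has >= 3 codons.
positives = select_positives(library, lambda first, second: len(first) >= 9)
```

In the real method the information on which pair restored
activity comes from sequencing the selected clones; here it is
simply the pair itself.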

By using a combinatorial library approach, comprising randomly
created first subsequences and randomly created complementary
second subsequences, the drawbacks of rational design of
split-protein sensors are overcome. Most advantageously, even
fragmentation sites of proteins encoded by said subsequences may
thereby be identified which would never have been readily
predicted by any rational approach. First subsequences and
complementary subsequences are ideally suitable in the context of
the present invention when reconstitution of activity of the
corresponding reporter protein occurs to a significant extent
only when both corresponding subdomains are forced into close
spatial proximity, i.e. when the subdomains do not self-assemble
to reconstitute a detectable amount of an active reporter
protein.

DNA sequences of suitable reporter proteins are readily available
to the person of routine skill in the art (step (a)), e.g. from
the National Center for Biotechnology Information (NCBI),
National Library of Medicine, Building 38A, Bethesda, MD 20894.
Genes encoding for reporter proteins may then be amplified e.g.
from a suitable host cell by PCR using standard techniques and
primers suitably designed based on the known DNA sequence (vide
supra), or the gene encoding for a reporter protein may be
completely built up de novo from suitably designed
oligonucleotides.

DNA manipulating techniques that may be used in step (b) for the
creation of a library based on said DNA sequence are likewise
readily apparent to the person of routine skill in the art. In
short, N- and C-terminal domains of the wild-type reporter
protein are amplified separately from a suitable source of DNA by
standard PCR techniques, and are subsequently recombined using
standard overlap extension PCR techniques in order to re-arrange
the wild-type gene, which preferably now contains the N- and
C-termini of the wild-type gene connected with each other as an
internal part of the sequence, and preferably comprises a unique
restriction site at the junction of the wild-type N- and
C-termini. At the same time, suitable restriction sites may be
designed at the newly created N- and C-termini in order to allow
for efficient subsequent cloning steps; most preferably, the
restriction site is designed for the same restriction enzyme at
both the N- and C-terminus. Most preferably, the re-arranged DNA
construct is inserted into a high-copy plasmid, the plasmid is
amplified by standard techniques, and the re-arranged DNA of
interest is thereafter cut out of the high-copy plasmid using the
restriction sites at the newly created N- and C-termini.

The rearranged gene is then incubated with a ligase to yield
dimerized, oligomerized and circularized DNA constructs.
Afterwards, these constructs are digested e.g. with a suitable,
randomly cutting DNase, and fragments corresponding to the
wild-type length are preferably thereafter treated with ligase
and polymerase to repair nicks and gaps and to blunt the ends of
the fragments. Afterwards, the DNA fragments corresponding to the
wild-type length of the reporter protein's gene are isolated e.g.
by standard agarose gel electrophoresis procedures. The resulting
fragments are preferably blunt-end cloned into a suitable
expression vector, which was cleaved at a unique restriction site
(preferably blunt-end). The expression vector is especially
designed by standard DNA manipulation techniques to provide a
construct after blunt-end cloning, in which one of the
artificially generated new N- and C-termini is under the control
of a promoter sequence and especially fused to a gene encoding
for a tag sequence and a gene encoding for a first peptide or
protein C1, each preferably via a linker sequence. Moreover, the
other terminus, respectively, is especially fused to a gene
encoding for a preferably different tag sequence and a gene
encoding for a second peptide or protein C2. Peptides or proteins
C1 and C2 are known to interact with each other in vivo, and may
e.g. be leucine zippers. The tag sequences may afterwards
advantageously be used for the control of correct expression and
stability of fusion proteins.

After transformation and amplification in a suitable host such as
e.g. E. coli XL1-Blue to a typical library size of about 10^ to
10^ independent clones, the vector is linearized at a restriction
site at the wild-type N- and C-termini, and an oligonucleotide is
inserted into the resulting gap, which is specifically designed
to integrate a terminator for the first domain of said reporter
protein and a promoter sequence for the second domain of said
reporter protein, by homologous recombination in a suitable host
such as yeast according to standard procedures. The
oligonucleotide is designed and constructed by standard PCR
techniques to provide flanking regions at both the 5' and 3' ends
of e.g. about 50 bp that are homologous to the gene of the
reporter protein in order to allow for successful homologous
recombination. In this way, the selection of clones possessing
fragmentation sites at or near the wild-type N- and C-termini can
be suppressed. For subsequent selection, a marker gene is also
provided by the oligonucleotide, e.g. encoding for a protein
involved in antibiotic resistance. Successful homologous
recombination may thus be easily observed by growth in the
presence of the respective antibiotic.
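
The geometry of the random circular permutation described above
can be checked with a short string-based sketch. It is a
simplified model under stated assumptions: the gene is treated as
a plain string, a single random cut reopens the circle, and the
function names and the example lengths are hypothetical. Only the
flank length of about 50 bp is taken from the text.

```python
from typing import Tuple


def circularly_permute(gene: str, cut: int) -> str:
    """Join the wild-type ends and reopen the circle at position `cut`."""
    if not 0 <= cut < len(gene):
        raise ValueError("cut must lie inside the gene")
    return gene[cut:] + gene[:cut]


def split_at_junction(permuted: str, cut: int) -> Tuple[str, str]:
    """Separate the permuted gene at the wild-type junction.

    Returns (wild-type N-terminal part, wild-type C-terminal part).
    """
    junction = len(permuted) - cut   # where the old C-terminus meets the old N-terminus
    c_part = permuted[:junction]     # equals gene[cut:]
    n_part = permuted[junction:]     # equals gene[:cut]
    return n_part, c_part


def junction_has_flanks(gene_length: int, cut: int, flank: int = 50) -> bool:
    """True if at least `flank` bp of reporter gene lie on both sides of the junction."""
    return cut >= flank and gene_length - cut >= flank


gene = "ABCDEFGHIJ"                      # toy stand-in for a coding sequence
permuted = circularly_permute(gene, 3)   # new termini created at position 3
n_part, c_part = split_at_junction(permuted, 3)
```

The sketch makes two points explicit. Inserting the
terminator/promoter cassette at the wild-type junction leaves two
fragments whose split point is the random cut. A cut closer than
the flank length to either wild-type terminus leaves too little
homology on one side of the junction, which is how such clones
are suppressed.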

Step (c) is preferably carried out by growing the respective
transformants of the library on medium which e.g. lacks a
nutrient, e.g. an amino acid, or which provides a substrate for a
color reaction. Thus, preferably a growth assay or a color assay
is performed, thereby allowing for easy selection of those
transformants which lead to a restoration of activity of the
reporter protein, which is e.g. essentially involved in the
synthesis of said nutrient, e.g. said amino acid, or in said
color reaction. Step (c) especially involves the elimination of
false positives, i.e. first subdomains and complementary second
subdomains that reconstitute an active reporter enzyme by
self-reassembly, i.e. without the need of an external influence
forcing the two domains into close spatial proximity. This can be
done e.g. by fusing the respective first and second subdomains
of the reporter protein to first and second peptides or proteins
that do not interact with each other, and/or by testing the
respective first and second subdomains without any first and
second peptides fused thereto at all, and/or by testing
constructs lacking the first or the second subdomain,
respectively. These assays can be performed by techniques
commonly known in the art of e.g. two-hybrid assays.
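
The control experiments listed above amount to a simple decision
rule, sketched here as an illustration. The record layout and the
names GrowthResults and is_true_positive are hypothetical; each
field records whether growth (or color) was observed for one
construct.

```python
from dataclasses import dataclass


@dataclass
class GrowthResults:
    interacting_partners: bool      # fragments fused to interacting peptides
    non_interacting_partners: bool  # fragments fused to peptides that do not interact
    no_partners: bool               # fragments without any fused peptide
    first_fragment_only: bool       # construct lacking the second subdomain
    second_fragment_only: bool      # construct lacking the first subdomain


def is_true_positive(r: GrowthResults) -> bool:
    """Positive only if activity depends on the interaction of the fused partners."""
    controls = (
        r.non_interacting_partners,
        r.no_partners,
        r.first_fragment_only,
        r.second_fragment_only,
    )
    return r.interacting_partners and not any(controls)


candidate = GrowthResults(True, False, False, False, False)
self_assembling = GrowthResults(True, True, True, False, False)
```

A pair such as self_assembling, which also grows without
interacting partners, is rejected as a false positive.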

Identification of suitable subdomains and subsequences, i.e.
suitable fragmentation sites, can be performed by common DNA-
and/or protein sequencing techniques.

According to a preferred embodiment, the reporter protein is
detectable in vivo and/or in vitro, both as full-length protein
and when actively reassembled from a first subdomain and a
complementary second subdomain, by a means chosen from the group
consisting of color assays and growth assays.

Growth assays provide the advantage of a selection step, i.e.
only positives grow under the chosen conditions, thus eliminating
the need to further screen all individuals of the library. For
example, only positives that comprise a suitable combination of
first subdomain and complementary second subdomain grow as
colonies on nutrition-specific plates. Color assays, moreover,
can be individually designed depending on the specific reporter
protein, when this reporter protein is involved naturally in or
artificially usable for a color-developing reaction. In some
cases, a substrate for such a reporter protein may be
incorporated into the growth medium, e.g. the plate, whereupon
colored colonies appear due to reconstitution of an active
reporter protein by a first subdomain and a complementary second
subdomain in vivo. Quantification of such an in vivo color assay
may optionally be performed with samples obtained from such
colonies. The general procedure of growth assays, color assays
and subsequent quantification of the color assay are known in
principle from the classical two-hybrid system, cf. e.g. US
5,667,973, incorporated herein by reference.

In an especially preferred embodiment, individuals of the library
as defined in (b) are either prokaryotic or eukaryotic host
cells, comprising:

- both said first subsequence and said complementary second
  subsequence in one and the same expression vector, suitable
  for (co-)expression of said first subsequence and said
  complementary second subsequence in vivo; or
- said first subsequence in a first expression vector suitable
  for (co-)expression of said first subsequence, and said
  complementary second subsequence in a second expression vector
  suitable for (co-)expression of said complementary second
  subsequence.

In vivo assays are preferred at least in the first step, e.g. as
a growth assay as outlined above. Thus, prokaryotic or eukaryotic
host cells are provided that are manipulated so as to allow for
the (co-)expression of both the first and the complementary
second subdomain of the reporter protein. Depending on the
specific application, both subdomains may of course be encoded by
one and the same, or by separate vectors. In most cases, encoding
by one and the same vector will be favourable. A large number of
suitable expression vectors for use as a basis in this respect
are available to the person of routine skill in the art, e.g. the
pRS316-based yeast expression vector (cf. Sikorski, R.S., and
Hieter, P. (1989), Genetics 122, 19-27, incorporated herein by
reference).



WO 2005/038050    PCT/EP2004/011289



It is especially preferred that the screening for restoration of
detectable activity of said reporter protein, when said first
subdomain and said complementary second subdomain are brought
into close proximity as defined in (c), comprises the following
steps:

- creating a first fusion subsequence comprising the first
  subsequence of said reporter protein as defined in (b),
  fused to an oligonucleotide encoding for a first protein or
  peptide,
- creating a second fusion subsequence comprising the
  complementary second subsequence of said reporter protein as
  defined in (b), fused to an oligonucleotide encoding for a
  second protein or peptide,

wherein said first protein or peptide and said second protein or
peptide are known to interact.

By creating said first fusion subsequence and said second fusion
subsequence, the first subdomain and the complementary second
subdomain are forced into close spatial proximity, thus allowing
for a screening for restoration of activity of the reporter
protein when the subdomains are forced into close proximity.
Preferably, said first protein or peptide and said second protein
or peptide are chosen to be robust and relatively small proteins
or peptides; especially preferred in the context of the invention
are leucine zippers, most preferably leucine zippers which
associate to an anti-parallel coiled coil (interacting proteins
fused to the 3'-terminus of the first subdomain and the
5'-terminus of the second subdomain, or vice versa,
respectively). However, for specific embodiments, a parallel
orientation may be preferred, e.g. for testing membrane proteins,
which most commonly expose both the N- and the C-terminus on one
and the same side.

According to a further embodiment said first fusion subsequence
and said second fusion subsequence are created by blunt-end
ligation.

Blunt-end ligation is the method of choice for the construction
of said fusion subsequences, as due to the evolutionary, random
approach of library generation no predictable, specific
sticky-end ligation can be performed. Although blunt-end ligation
leads to the creation of statistical amounts of ligation products
which are out of the reading frame, this approach still proved
sufficiently efficient for the identification of suitable
fragmentation sites according to the invention.
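
The statistical loss to out-of-frame products can be estimated
with a toy count. This is a back-of-the-envelope model, not a
figure from the invention: it assumes that every cut position and
both insert orientations are equally likely, that the reading
frame of the upstream fusion partner is fixed by the vector, and
that the insert length is a multiple of three, so that the
downstream junction is in frame whenever the upstream one is. The
function name and the example length of 600 nucleotides are
hypothetical.

```python
from fractions import Fraction


def in_frame_fraction(gene_length: int) -> Fraction:
    """Fraction of (cut position, orientation) pairs that give a usable insert."""
    usable = 0
    total = 0
    for cut in range(gene_length):
        for forward in (True, False):
            total += 1
            # Usable only in the forward orientation with the cut on a codon boundary.
            if forward and cut % 3 == 0:
                usable += 1
    return Fraction(usable, total)


fraction = in_frame_fraction(600)
```

Under these assumptions one insert in six is both correctly
oriented and in frame, which suggests why a sufficiently large
library can tolerate the loss.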

Moreover, in another especially preferred embodiment said first
fusion subsequence and said second fusion subsequence each
comprise

- a linker sequence in between said first subsequence (or said
  second subsequence, respectively) and said oligonucleotide
  encoding for a first protein or peptide (or said
  oligonucleotide encoding for a second protein or peptide,
  respectively);
- at least one tag that allows for verification of the
  transcription of said first fusion subsequence and said second
  fusion subsequence.

Linker sequences commonly prove useful in the art of construction
of fusion proteins in order to allow for proper folding of both
components of the fusion protein, individually or cooperatively,
and/or to achieve sufficient spatial integrity of both components
of the fusion protein.

The use of tag sequences that allow for the detection of
transcription of a gene sequence is also routinely applied in the
art. In the context of the present invention, tag sequences may
be applied to any of the N- and C-terminus of the first subdomain
and/or the N- and C-terminus of the complementary second
subdomain. It is especially preferred to provide differently
recognizable tag sequences at both the N- and the C-termini of
each transcription product. Commonly applied tags are e.g. the
HA tag, the FLAG tag or the like. Detection of correct expression
of these tags, and thereby of the fusion protein(s), may be
performed e.g. by Western blotting according to routine
procedures.

According to an especially preferred embodiment, an
oligonucleotide is inserted by homologous recombination in
between said first subsequence and said second subsequence,
encoding for:

- a transcription terminating sequence for terminating
  transcription of said first or said second subsequence;
- a transcription promoting sequence for initiating
  transcription of said second or said first subsequence,
  respectively;
- a marker sequence allowing for control of successful
  homologous recombination.

An especially advantageous way of carrying out the present
invention is simply to provide said first and said second
subsequence initially as one continuous sequence, preferably
rearranged, and thereafter to separate them by introducing a
transcription terminating sequence succeeding the first
subsequence, and a transcription promoting sequence preceding the
second subsequence. Thereby, separate expression of both the
first subdomain and the complementary second subdomain, or their
fusion domains, respectively, is secured. This goal may be
especially advantageously achieved by homologous recombination at
a predefined site in between said first and said second
subsequence (cf. Oldenburg, K.R., Vo, K.T., Michaelis, S., and
Paddon, C. (1997), Nucleic Acids Res 25, 451-452, incorporated
herein by reference).
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In order to eliminate the otherwise high risk of isolating sub- 
domains that are fragmented at fragmentation sites nearby the 
N- and C-termini of the wild— type reporter protein, it is espe- 
cially preferred to not provxde the DNA sequence of said re- 
5 porter protein according to step (a) , vide supra, in its wild- 
type configuration, but rather already with the wild-type N- and 
C-termini connected with each other and being an internal part 
of the DNA sequence of said DNA sequence. Thereby, artificial 
new N- and C-termini are created in the starting material. Most 

10 preferably, a unique restriction site RE2 is introduced in be- 
tween the wild-type N- and C— terminus. A further restriction 
site REl is advantageously introduced at the new artificial N- 
and C-terminus of the DNA secjuence of said reporter protein ac- 
cording to step (a) , allowing for easy and convenient cloning 

15 and construction of libraries according to step (b) , vide supra. 
Due to the unique restriction site RE2, homologous recombination 
in a suitable host cell can loe performed in between the wild- 
type N- and C-terminus of the reporter protein. Due to the nec- 
essary overlap for successful homologous recombination, isola- 

20 tion of subdomains with fragmentation sites at or nearby the 

wild-type N- and C-terminus ±s suppressed. Most preferably, the 
oligonucleotide used for homologous recombination comprises a 
selection marker such as e.g. a gene involved in antibiotic re- 
sistance in order to check for successful homologous recombina- 

25 tion. 
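The rearrangement of the starting material described above can be
illustrated with a minimal Python sketch. It is not part of the
disclosed method: the toy sequence, the chosen start position and
the function names are hypothetical, reading frame and codon
boundaries are ignored, and the AvrII recognition sequence CCTAGG
merely stands in for the unique site RE2.

```python
def rearrange(gene: str, new_start: int, linker: str) -> str:
    """Return the gene re-ordered so that it begins at index new_start.

    The original 3' end is joined to the original 5' end through
    `linker`, so the wild-type termini become an internal part of the
    sequence and new artificial termini are created at new_start.
    """
    if not 0 < new_start < len(gene):
        raise ValueError("new_start must lie strictly inside the gene")
    return gene[new_start:] + linker + gene[:new_start]


def linker_span(gene: str, new_start: int, linker: str) -> tuple:
    """Return (start, end) indices of the linker in the rearranged gene."""
    start = len(gene) - new_start
    return start, start + len(linker)


toy_gene = "AAACCCGGGTTT"      # hypothetical stand-in for a reporter gene
linker = "CCTAGG"              # AvrII recognition sequence, used here as RE2
rearranged = rearrange(toy_gene, 6, linker)
print(rearranged)                          # rearranged toy sequence
print(linker_span(toy_gene, 6, linker))    # where RE2 sits internally
```

In this toy model the RE2 site ends up between the former C- and
N-terminal halves, which is where the later recombination step
inserts the terminator and promoter.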

Thus, in a further embodiment, the method comprises the steps
of:

- creating fragmentation sites in TRP1 using gene cleavage with
  a unique restriction enzyme RE1 and circularization;
- isolating fragments corresponding to the wild-type length;
- subcloning using blunt ends, preferably into a pRS316-based
  yeast expression vector under the control of a copper promoter
  (PCUP1), and transforming into E. coli, preferably XL1Blue;
- recombining and amplifying homologues with a unique
  restriction site RE2, preferably AvrII, introduced between the
  original N- and C-termini to allow subsequent linearization of
  the vector;
- locating two leucine zippers in the plasmid at the 3'- and the
  5'-ends of the newly generated N- and C-termini, the zippers
  being positively and negatively charged helices to allow
  heterodimerization, preferably each heterodimer containing a
  buried asparagine residue in a position to force antiparallel
  orientation of the zippers.

The invention further relates to a recombinant DNA sequence for
use in securing expression in a prokaryotic or eukaryotic host
cell of a polypeptide product having the primary structural
conformation of a first subdomain of a reporter protein or a
complementary second subdomain of a reporter protein, wherein
detectable activity of said reporter protein is restored when
said first subdomain and said complementary second subdomain are
brought into close proximity, and wherein said first and said
complementary second subdomain are not subdomains of one of the
group of proteins consisting of transcriptional activators,
ubiquitin, dihydrofolate reductase, β-lactamase, green
fluorescent protein and closely related variants such as e.g.
ECFP, EGFP or the like, β-galactosidase, inteins, cAMP cyclase,
glycinamide ribonucleotide transformylase, aminoglycoside
phosphotransferase, hygromycin B phosphotransferase, luciferase.

In the above-mentioned and herewith disclaimed DNA sequences,
suitable fragmentation sites for split-protein sensors were
already identified by rational design (cf. e.g. Methods
Enzymology 238, Michnick et al. 2000). However, the present
invention now opens up for the first time the possibility of
identifying suitable fragmentation sites in any other DNA
sequence encoding for a reporter protein by a random library
approach, too. With this tool provided to the person of routine
skill in the art by the method disclosed herein, suitable
fragmentation sites may now be identified with relative ease.

In especially preferred embodiments, said DNA sequence encodes
for a subdomain of a (β/α)8-barrel enzyme, such as e.g. Trp1p.

In further embodiments, which proved especially advantageous,
said DNA sequence is selected from the group consisting of:

(a) the DNA sequences set out in Table 1 and their complementary
    strands;
(b) DNA sequences which hybridize under stringent conditions to
    the protein coding regions of the DNA sequences defined in
    (a) or fragments thereof;
(c) DNA sequences which, but for the degeneracy of the genetic
    code, would hybridize to the DNA sequences defined in (a) or
    (b) and which sequences code for a polypeptide having the
    same amino acid sequence.
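Item (c) relies on the degeneracy of the genetic code: different
DNA sequences can encode the same polypeptide. A minimal Python
sketch of this point follows. It assumes the standard genetic
code; the two short sequences and the function names are
hypothetical and are not taken from the sequence listing.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding DNA sequence (length divisible by 3)."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def same_polypeptide(dna_a: str, dna_b: str) -> bool:
    """True if both DNA sequences encode the same amino acid sequence."""
    return translate(dna_a) == translate(dna_b)


seq_a = "ATGGCTAAATGG"  # hypothetical example sequence
seq_b = "ATGGCCAAGTGG"  # differs at two wobble positions
print(translate(seq_a), translate(seq_b), same_polypeptide(seq_a, seq_b))
```

The two sequences differ at the DNA level but translate to the
same peptide, which is the situation item (c) is meant to cover.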



The above-mentioned DNA sequences encode for the split-Trp
sensors split-Trp44 (i.e. 44Ntrp and 44Ctrp), split-Trp53 (i.e.
53Ntrp and 53Ctrp), split-Trp187 (i.e. 187Ntrp and 187Ctrp) and
split-Trp204 (i.e. 204Ntrp and 204Ctrp), which proved to be
valuable tools as split-protein sensors (numbering according to
the fragmentation site, given as the last amino acid of the
N-terminal subdomain). Especially split-Trp44 was successfully
applied herein to demonstrate the interaction of membrane
proteins.
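The numbering convention (fragmentation site given as the last
amino acid of the N-terminal subdomain) can be illustrated with a
short Python sketch. The ten-residue toy sequence and the
function name are hypothetical; the actual Trp1p fragments are
those of the sequence listing.

```python
from typing import Tuple


def split_at(protein: str, site: int) -> Tuple[str, str]:
    """Split a protein into N- and C-terminal fragments.

    `site` is the 1-based number of the last residue that still
    belongs to the N-terminal fragment, matching the naming used
    for the split-Trp sensors.
    """
    if not 1 <= site < len(protein):
        raise ValueError("site must leave two non-empty fragments")
    return protein[:site], protein[site:]


toy_protein = "MSVINFTGSS"           # hypothetical sequence
n_frag, c_frag = split_at(toy_protein, 4)
print(n_frag, c_frag)                # N-terminal and C-terminal parts
```

Under this convention a sensor named split-Trp44 has an
N-terminal fragment ending with residue 44 and a C-terminal
fragment starting with residue 45.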


The DNA- and amino acid sequences of the above-mentioned
split-Trp sensors are given in the attached sequence listing as
follows:

SEQ ID NO: 3    44Ntrp   (DNA sequence);
SEQ ID NO: 4    44Ntrp   (amino acid sequence);
SEQ ID NO: 5    44Ctrp   (DNA sequence);
SEQ ID NO: 6    44Ctrp   (amino acid sequence);
SEQ ID NO: 7    53Ntrp   (DNA sequence);
SEQ ID NO: 8    53Ntrp   (amino acid sequence);
SEQ ID NO: 9    53Ctrp   (DNA sequence);
SEQ ID NO: 10   53Ctrp   (amino acid sequence);
SEQ ID NO: 11   187Ntrp  (DNA sequence);
SEQ ID NO: 12   187Ntrp  (amino acid sequence);
SEQ ID NO: 13   187Ctrp  (DNA sequence);
SEQ ID NO: 14   187Ctrp  (amino acid sequence);
SEQ ID NO: 15   204Ntrp  (DNA sequence);
SEQ ID NO: 16   204Ntrp  (amino acid sequence);
SEQ ID NO: 17   204Ctrp  (DNA sequence);
SEQ ID NO: 18   204Ctrp  (amino acid sequence).



In preferred embodiments according to the present invention,
said DNA sequences are used in securing expression in a
prokaryotic or eukaryotic host cell of a polypeptide fusion
product. Such securing of expression may be achieved by any
means routinely applied by the person of routine skill in the
art, comprising e.g. incorporation of said DNA sequences into
suitable expression vectors or integration of said DNA sequences
into the genome of said host.

The invention further relates to a first subdomain of a reporter
protein or a complementary second subdomain of a reporter
protein, wherein detectable activity of said reporter protein is
restored when said first subdomain and said complementary second
subdomain are brought into close proximity, and wherein said
first and said complementary second subdomain are not subdomains
of one of the group of proteins consisting of transcriptional
activators, ubiquitin, dihydrofolate reductase, β-lactamase,
green fluorescent protein and closely related variants such as
e.g. ECFP, EGFP or the like, β-galactosidase, inteins, cAMP
cyclase, glycinamide ribonucleotide transformylase,
aminoglycoside phosphotransferase, hygromycin B
phosphotransferase, luciferase.

In the above-mentioned and herewith disclaimed proteins,
suitable fragmentation sites for split-protein sensors were
already identified by rational design. However, the present
invention now opens up for the first time the possibility of
identifying suitable fragmentation sites in any other reporter
protein by a random library approach, too. With this tool
provided to the person of routine skill in the art by the method
disclosed herein, suitable fragmentation sites may now be
identified with relative ease.

According to especially preferred embodiments of the invention,
a first subdomain of a reporter protein or a complementary
second subdomain of a reporter protein is produced by a method
of culturing a host transformed with a recombinant DNA sequence
as outlined above, wherein said DNA sequence further comprises
an expression control sequence, said expression control sequence
being operatively linked to said DNA sequence. Said expression
control sequences comprise especially those which are commonly
referred to as tags and which are recognizable e.g. by
Western-blotting procedures routinely applied in the art.

The invention further relates to a fusion protein comprising a
first subdomain of a reporter protein or a complementary second
subdomain of a reporter protein as outlined above, and a further
peptide or protein connected thereto in a combination that does
not occur naturally. By creating such artificial fusion
proteins, said further protein or peptide may then be tested for
interaction with e.g. a specifically chosen counterpart or
against a library of possible counterparts. Moreover,
library-library screening assays may also be applied, e.g.
genome-wide library screenings as already performed in the art
of traditional two-hybrid assays.

The invention further relates to a prokaryotic or eukaryotic 
host cell line, transformed with recombinant DNA sequences as 
outlined above. 

Said prokaryotic or eukaryotic host cell lines are preferably E.
coli or yeast strains. For cloning and storage purposes, mostly
E. coli strains such as XL1Blue will be chosen. For the method
of identification of suitable fragmentation sites according to
the invention, especially involving the step of homologous
recombination, a yeast strain may be chosen such as e.g.
Saccharomyces cerevisiae, e.g. EGY48, and Schizosaccharomyces
pombe. The choice of a suitable host cell line is routinely
performed by the person of skill in the art, depending on the
specific purpose; such host cell lines are commonly available.

The invention is further related to a kit of parts, comprising a
first and a second DNA-based expression vector, wherein
- said first expression vector contains an expression cassette
  encoding for a polypeptide product having at least a
  substantial part of the primary structural conformation of a
  first subdomain of a reporter protein; and
- said second expression vector contains an expression cassette
  encoding for a polypeptide product having at least a
  substantial part of the primary structural conformation of a
  complementary second subdomain of a reporter protein; and
wherein detectable activity of said reporter protein is restored
when said first subdomain and said complementary second
subdomain are brought into close proximity, and wherein said
first and said complementary second subdomain are not subdomains
of one of the group of proteins consisting of transcriptional
activators, ubiquitin, dihydrofolate reductase, β-lactamase,
green fluorescent protein and closely related variants such as
e.g. ECFP, EGFP or the like, β-galactosidase, inteins, cAMP
cyclase, glycinamide ribonucleotide transformylase,
aminoglycoside phosphotransferase, hygromycin B
phosphotransferase, luciferase.

According to a further especially preferred embodiment, such a
kit of parts further comprises a suitable prokaryotic or
eukaryotic host cell line for expression of said first and
second expression vector.

Having provided by the present invention a tool for identifying
novel fragmentation sites in reporter proteins, another major
aspect of the present invention is related to a method for
detecting an interaction between a first test peptide or protein
or a fragment thereof, and a second test peptide or protein or a
fragment thereof, the method comprising the steps of:

- providing recombinant DNA sequences as outlined above for use
  in securing expression of a first subdomain of a reporter
  protein and a complementary second subdomain of a reporter
  protein;
- fusing an oligonucleotide or a gene encoding for a first test
  peptide or protein to the DNA sequence encoding for said first
  subdomain of the reporter protein, thereby creating a first
  DNA fusion sequence encoding for a fusion protein comprising
  said first subdomain of the reporter protein and said first
  test peptide or protein;
- fusing an oligonucleotide or a gene encoding for a second test
  peptide or protein to the DNA sequence encoding for said
  complementary second subdomain of the reporter protein,
  thereby creating a second DNA fusion sequence encoding for a
  fusion protein comprising said complementary second subdomain
  of the reporter protein and said second test peptide or
  protein;
- (co-)expressing said fusion protein comprising said first
  subdomain of the reporter protein and said first test peptide
  or protein, and said fusion protein comprising said second
  complementary subdomain of the reporter protein and said
  second test peptide or protein in a suitable prokaryotic or
  eukaryotic host cell;
- screening and/or selecting for restoration of detectable
  activity of said reporter protein.

Utilizing split-protein sensors with subdomains identified by a
method according to the invention, interaction of said first
test peptide and said second test peptide may be identified.
Given the tool of identifying suitable fragmentation sites in
virtually any reporter protein, the person of routine skill in
the art is no longer hampered by the restriction of the
existing, rationally designed split-protein systems to specific
cellular compartments, but rather may now choose a reporter
protein depending on the specific test purpose.



In the most preferred embodiment, a library of oligonucleotides
or DNA encoding for a set of first test peptides or proteins
and/or a library of oligonucleotides or DNA encoding for a set
of second test peptides or proteins are fused to said first
subdomain of said reporter protein and/or said complementary
second subdomain of said reporter protein, respectively.

According to an especially preferred embodiment of the present
invention, the interaction between a first test peptide or
protein or a fragment thereof and a second test peptide or
protein or fragment thereof is mediated by a chemical inducer of
dimerization, which binds either covalently or non-covalently to
both said test peptides or proteins or fragments thereof.

Comparable systems are commonly referred to in the literature as
three-hybrid systems. Chemical inducers of dimerization (CIDs)
were first described by Schreiber and Crabtree (cf. Spencer
D.M., Wandless T.J., Schreiber S.L., and Crabtree G.R. (1993),
Science 262, 1019-1024, incorporated herein by reference). CIDs
are cell-permeable molecules that can simultaneously form a
covalent or non-covalent interaction with two different proteins
or peptides, thereby inducing their dimerization. Using
split-protein sensors according to the present invention, robust
drug and/or drug target screening assays, for example, may
easily be established. Towards this aim, e.g. Ntrp may be fused
to a protein library and Ctrp to an O(6)-alkylguanine-DNA
alkyltransferase (AGT), e.g. human AGT (hAGT). A substrate for
hAGT, e.g. benzylguanine, may easily be covalently linked to a
multitude of small molecules (hypothetical drugs), thus allowing
for an efficient screening for cellular targets contained in
said protein library that react or associate with the
corresponding drug.


Moreover, the invention is related to a method for detecting the
interruption of an interaction between a first test peptide or
protein or a fragment thereof, and a second test peptide or
protein or a fragment thereof, the method comprising the steps
of:

- providing recombinant DNA sequences according to one of claims
  11 to 14 for use in securing expression of a first subdomain
  of a reporter protein and a complementary second subdomain of
  a reporter protein;
- fusing an oligonucleotide or a gene encoding for a first test
  peptide or protein to the DNA sequence encoding for said first
  subdomain of the reporter protein, thereby creating a first
  DNA fusion sequence encoding for a fusion protein comprising
  said first subdomain of the reporter protein and said first
  test peptide or protein;
- fusing an oligonucleotide or a gene encoding for a second test
  peptide or protein to the DNA sequence encoding for said
  complementary second subdomain of the reporter protein,
  thereby creating a second DNA fusion sequence encoding for a
  fusion protein comprising said complementary second subdomain
  of the reporter protein and said second test peptide or
  protein;
- (co-)expressing said fusion protein comprising said first
  subdomain of the reporter protein and said first test peptide
  or protein, and said fusion protein comprising said second
  complementary subdomain of the reporter protein and said
  second test peptide or protein in a suitable prokaryotic or
  eukaryotic host cell;
- screening and/or selecting for interruption of interaction of
  said first subdomain and said second subdomain under the
  influence of one or more test agents.

Comparable systems are commonly referred to in the literature as
reverse two-hybrid systems (or split-protein systems,
respectively). Exemplarily, 5-fluoroanthranilic acid (FAA) is
metabolized in vivo into a toxic product by the tryptophan
biosynthetic enzymes. Applying the split-Trp sensors according
to the invention, the disruption of a protein-protein
interaction, leading to the spatial separation of the Trp1p
fragments (and thus inactivity of the reporter protein), can
therefore be linked to the survival of the cells on medium
containing FAA. By way of example, libraries of small molecules
may be screened for their ability to interact with a pair of
fusion proteins. Selection of proteins or peptides that disrupt
an interaction can be done by co-expressing two interacting
proteins with a random protein or peptide library, e.g. on
plates containing FAA. The reverse split-Trp sensors may also
advantageously be used to determine the binding region of a
protein. A random library of the protein carrying mutations is
co-expressed with its binding partner on plates containing FAA.
Only cells that express a library member with mutations in or
affecting the binding region, thus disrupting the interaction of
the two proteins, will be able to grow in the presence of FAA.
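The selection logic of the forward and the reverse split-Trp
assay can be summarized in a small Python sketch. It is a Boolean
idealization, not a model of real growth behaviour: it assumes
that Trp1p is active exactly when the two fusion partners
interact, and that the FAA plates otherwise permit growth of
cells lacking Trp1p activity. The medium labels and function
names are hypothetical.

```python
def trp1p_active(interaction_intact: bool) -> bool:
    """Idealization: the split Trp1p is reconstituted only while the
    fused test proteins interact."""
    return interaction_intact


def grows(interaction_intact: bool, medium: str) -> bool:
    """Predict growth under the simplified selection scheme.

    "minus_trp": forward selection, growth requires active Trp1p.
    "plus_faa":  reverse selection, active Trp1p converts FAA into a
                 toxic product, so growth requires inactive Trp1p.
    """
    active = trp1p_active(interaction_intact)
    if medium == "minus_trp":
        return active
    if medium == "plus_faa":
        return not active
    raise ValueError(f"unknown medium: {medium}")


for intact in (True, False):
    for medium in ("minus_trp", "plus_faa"):
        print(intact, medium, grows(intact, medium))
```

In this picture a library member that disrupts the interaction is
the only kind of cell that survives on FAA, which is what makes
the reverse assay a positive selection for disruption.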

Another aspect of the present invention is related to a use of
random circular permutation of a gene and/or the expressed
polypeptide derived therefrom for the identification of
fragmentation sites in a reporter protein for use in a
split-protein sensor. To date, random circular permutation has
not been used for the identification of such suitable
fragmentation sites for separately expressed subdomains, but
rather for the identification of proteins of at least
approximately wild-type length, but with artificially new N- and
C-termini, and with the wild-type N- and C-termini being
connected to each other and being an internal part of the
sequence. However, this approach now surprisingly proved to be
an outstandingly valuable tool for the evolutionary,
combinatorial approach of identifying suitable fragmentation
sites for subdomains to be expressed separately.
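The idea behind random circular permutation can be sketched in
Python as follows. The sketch idealizes the procedure: it assumes
a circular monomer that is opened by exactly one random cut,
whereas in practice ligation also yields oligomers and fragments
of wild-type length are isolated afterwards. The toy sequence
and the function names are hypothetical.

```python
import random
from typing import List, Tuple


def circular_permutations(seq: str) -> List[str]:
    """All linear sequences obtained by opening the circle at each position."""
    return [seq[i:] + seq[:i] for i in range(len(seq))]


def random_linearize(seq: str, rng: random.Random) -> Tuple[int, str]:
    """Open the circularized sequence at one random position.

    Returns the cut position and the resulting linear molecule, which
    has wild-type length but new termini at the cut.
    """
    cut = rng.randrange(len(seq))
    return cut, seq[cut:] + seq[:cut]


rng = random.Random(0)                 # fixed seed for reproducibility
library = [random_linearize("ABCDEFG", rng) for _ in range(5)]
for cut, molecule in library:
    print(cut, molecule)
```

Each library member has its own new termini, and the position of
the cut relative to the joined wild-type termini is what later
defines a candidate fragmentation site.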

A further aspect of the present invention is related to a use of
a host cell line that allows for homologous recombination of DNA
for the generation of a recombinant DNA molecule that secures
expression of both a polypeptide product comprising a first
subdomain of a reporter protein and a complementary second
subdomain of a reporter protein from said recombinant DNA
molecule. To date, homologous recombination has not been used
for this purpose, but has now surprisingly been found to be an
outstandingly valuable tool for simply and conveniently securing
expression of a first subdomain and a complementary second
subdomain of a reporter protein.

Detailed description of the invention 

The invention will now be described in even more detail by means 
of an example and a specific embodiment, together with the ac- 
companying figures; however, without the invention being limited 
thereto. 

Fig. 1: Combinatorial approach towards the generation of
        split-Trp sensors. As a starting point, a rearranged
        copy of the TRP1 gene was used in which the original N-
        and C-termini of TRP1 were connected by a short linker
        encoding a unique restriction site RE2, here an AvrII
        site. For convenient subcloning, another restriction
        site RE1 was introduced at the artificially created new
        N- and C-termini, here a HindIII site. The linear
        fragment was incubated with T4 DNA ligase to
        circularize/oligomerize the gene (step 1). Treatment of
        the ligation mix with DNase I resulted in randomly cut
        linear molecules, and fragments corresponding to the
        size of TRP1 were isolated (step 2). Isolated fragments
        were cloned into a yeast expression vector containing
        two polypeptides (C1 and C2) that associate into an
        antiparallel coiled coil (step 3). Homologous
        recombination in yeast cells was used to insert a
        terminator sequence and the PGAL1-promoter between the
        original N- and C-termini (step 4). Co-expression of the
        two fragments and selection for complementation of
        tryptophan auxotrophy of yeast cells allowed the
        isolation of functional split-Trp pairs.

Fig. 2: Selected split-Trp protein pairs capable of
        complementing tryptophan auxotrophy in yeast. The clones
        are named after the last residue of each N-terminal
        fragment. C1 and C2 are the two polypeptides that
        associate into the antiparallel coiled coil. Due to a
        shift in the reading frame in 5 of the twelve clones, C2
        is replaced by a peptide of 10 or 66 amino acids, and C1
        is replaced in one clone by a peptide of 26 residues.
        Five of the twelve analyzed clones lead to the
        expression of Trp1p fragments in which both fragments
        were fused in frame to the polypeptides C1 and C2
        (marked with an asterisk).

Fig. 3: Characterization of the selected split-Trp pairs that
        are marked with an asterisk in Fig. 2. Growth assays of
        yeast strains expressing split-Trp44, split-Trp53,
        split-Trp187, split-Trp204 or a fifth split-Trp pair
        (designation illegible in the source) on selective
        plates (+/Δ trp: plates with tryptophan / lacking
        tryptophan, respectively; +/Δ gal: plates with galactose
        / lacking galactose). For control experiments, yeast
        strains expressing the split-Trp proteins in which the
        sequence encoding for C2 was deleted from the plasmid
        (split-Trp-ΔC2) were also investigated. One colony of
        yeast cells EGY48 expressing different split-Trp protein
        pairs was resuspended in 1 ml water and 5 µl were
        spotted on medium with or without tryptophan and/or
        galactose, but always containing copper, at two
        different temperatures (30 °C and 23 °C). C1-Ctrp is
        under control of the leaky PCUP1-promoter and Ntrp-C2
        under the control of the PGAL1-promoter. Images were
        taken after 8 days.

Fig. 4: Analysis of the interaction between Sec62p and Sec63p
        using the split-Trp system. Left: Ntrp is fused to the N
        terminus of Sec62p and Ctrp is fused to the C terminus
        of Sec63p, resulting in Ntrp-Sec62p and Sec63p-Ctrp,
        respectively. The linker between the cytosolic domains
        of Sec62p and Sec63p and the corresponding Trp1p
        fragments consists of six residues. The known
        interaction between the positively charged cytosolic
        N-terminal domain of Sec62p and the negatively charged
        C-terminal tail of Sec63p should lead to the
        reconstitution of active Trp1p and complementation of
        tryptophan auxotrophy. Right: Co-expression of
        Ntrp-Sec62p with Ste14p-Ctrp, a further membrane protein
        of the ER, which does not interact with Sec62p, should
        not lead to the formation of a functional Trp1p and the
        complementation of tryptophan auxotrophy.

Fig. 5: Split-Trp interaction assay of Sec62p and Sec63p. A
        colony of EGY48 cells co-expressing Ntrp-Sec62p with
        Sec63p-Ctrp or Ste14p-Ctrp was suspended in 1 ml water
        and 5 µl were spotted on copper-containing medium with
        or without tryptophan. Cells co-expressing
        44Ntrp-Sec62p/Sec63p-44Ctrp complement tryptophan
        auxotrophy as indicated by their growth after 4 days at
        23 °C. Large colonies were visible after 7 days of
        incubation, whereas only small colonies were observed
        for cells expressing 187Ntrp-Sec62p/Sec63p-187Ctrp. No
        or only very small colonies were observed for cells
        co-expressing 53Ntrp-Sec62p/Sec63p-53Ctrp or
        204Ntrp-Sec62p/Sec63p-204Ctrp, respectively. No growth
        was observed for cells co-expressing
        44Ntrp-Sec62p/Ste14p-44Ctrp or
        187Ntrp-Sec62p/Ste14p-187Ctrp even after 10 days of
        incubation at 23 °C.

DNA- and protein sequences SEQ ID NO: 1 to SEQ ID NO: 66 are
given in the attached sequence listing, incl. all primers and
oligonucleotides used for the construction of the vectors.

For any standard molecular biology and especially DNA- and
protein manipulation protocols, reference is generally made to
Sambrook, J. et al., eds., Molecular Cloning, A Laboratory
Manual, 2nd edition, Cold Spring Harbor Laboratory Press, Cold
Spring Harbor, N.Y. (1989); Ausubel, F. et al., eds., Current
Protocols in Molecular Biology, John Wiley & Sons, Inc. (1997);
HYBRID HUNTER™ INSTRUCTION MANUAL, Invitrogen BV, Groningen,
Netherlands (1999); Burke, D. et al., METHODS IN YEAST GENETICS.
A COLD SPRING HARBOR LABORATORY COURSE MANUAL, Cold Spring
Harbor Laboratory Press (2000).

Yeast Media. Yeast complete medium containing adenine (YPAD) was
used for cultures of Saccharomyces cerevisiae EGY48 and RSY529.
Dropout media (YC) were used to select for the presence of
pRS315- or pRS316-derived plasmids and for the complementation
of tryptophan auxotrophy. Lacking amino acids or components in
the resulting medium are indicated by the addition of their
one-letter code to the YC-dropout medium. Selective YC-medium
used to plate out the yeast cells after transformation by
electroporation was supplemented with 1 M sorbitol. For the
expression of proteins from the PGAL1-promoter, 2 % galactose
and 0.5 % raffinose replaced glucose as carbon source in the
YC-medium.

YPAD: 1 % yeast extract, 2 % peptone, 2 % dextrose, 100 mg/l
      adenine (2 % agar for plates)

YC:   0.12 % yeast nitrogen base, 0.5 % ammonium sulfate, 1 %
      succinic acid, 2 % glucose, 0.6 % NaOH, 1.4 g/l yeast
      synthetic dropout medium omitting histidine (H), leucine
      (L), tryptophan (W) and uracil (U) (2 % agar for plates),
      supplemented as follows:

      -L:   0.05 g/l histidine (H), 0.1 g/l tryptophan (W),
            0.1 g/l uracil (U)
      -U:   0.05 g/l histidine (H), 0.1 g/l leucine (L),
            0.1 g/l tryptophan (W)
      -LU:  0.05 g/l histidine (H), 0.1 g/l tryptophan (W)
      -LUW: 0.05 g/l histidine (H)
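For preparing a given batch volume, the percentages above can be
converted into weighed amounts. The following Python sketch does
this for YPAD. It assumes that the percentages are weight per
volume (g per 100 ml), which the text does not state explicitly;
the batch volume of 500 ml and the function names are
hypothetical.

```python
# YPAD composition as listed above; percentages are ASSUMED to be w/v.
YPAD_PERCENT_WV = {"yeast extract": 1.0, "peptone": 2.0, "dextrose": 2.0}
ADENINE_MG_PER_L = 100.0


def grams_for(percent_wv: float, volume_ml: float) -> float:
    """Grams needed for volume_ml at the given % (w/v) concentration."""
    return percent_wv * volume_ml / 100.0


def ypad_recipe(volume_ml: float) -> dict:
    """Amounts for one batch: grams per component, adenine in mg."""
    recipe = {name: grams_for(pct, volume_ml)
              for name, pct in YPAD_PERCENT_WV.items()}
    recipe["adenine (mg)"] = ADENINE_MG_PER_L * volume_ml / 1000.0
    return recipe


batch = ypad_recipe(500.0)   # hypothetical 500 ml batch
for component, amount in batch.items():
    print(f"{component}: {amount:g}")
```

The same helper applies to the YC components; the supplements
given in g/l scale linearly with the batch volume in the same
way as the adenine.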

Transformation of yeast cells. The transformation of
Saccharomyces cerevisiae strains EGY48 or RSY529 with one or
more plasmids was done using a standard protocol for
transformation by electroporation. An overnight culture of EGY48
or RSY529 yeast cells in YPAD medium was diluted in 500 ml YPAD
to an OD600 of ~0.3 and grown at 30 °C and 260 rpm to an OD600
of ~1.4. The culture was harvested by centrifugation at 4300 rpm
and washed with 500 ml and 250 ml ice-cold sterile water and
with 30 ml ice-cold 1 M sorbitol. The pelleted cells were then
resuspended in 300 - 500 µl 1 M sorbitol and either used
directly for transformation or frozen in aliquots of 40 µl at
-80 °C. For the double transformation of two plasmids, competent
cells were always prepared freshly. A total amount of 100 ng
plasmid DNA was mixed with 40 µl competent yeast cells and
electroporated at 1.5 kV using a Stratagene Electroporator 1000
in a 0.2 mm cuvette. The cells were mixed with 500 µl ice-cold
1 M sorbitol immediately after the pulse and plated on the
corresponding solid selective YC-medium containing 1 M sorbitol.
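The dilution of the overnight culture to the starting OD600 is a
simple C1·V1 = C2·V2 calculation, sketched below in Python. The
overnight OD600 of 6.0 is a hypothetical value that would have
to be measured, and the sketch assumes that OD600 scales linearly
with cell density, which only holds for sufficiently diluted
readings.

```python
def inoculum_volume_ml(od_overnight: float, od_target: float,
                       final_volume_ml: float) -> float:
    """Volume of overnight culture needed to reach od_target in
    final_volume_ml, assuming OD scales linearly with dilution."""
    if od_overnight <= od_target:
        raise ValueError("overnight culture is not dense enough")
    return od_target * final_volume_ml / od_overnight


# Hypothetical overnight OD600; target and volume as in the protocol.
volume = inoculum_volume_ml(od_overnight=6.0, od_target=0.3,
                            final_volume_ml=500.0)
print(f"inoculate with about {volume:.1f} ml overnight culture")
```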

Cloning of pRS316-C1/2CUP1. A sequence containing two
polypeptides C1 and C2 was first assembled by PCR using a set of
primers as described by Stemmer et al. (cf. Oakley M.G., and Kim
P.S. (1998), Biochemistry 37, 12603-12610; Oakley M.G., and
Hollenbeck J.J. (2001), Curr Opin Struct Biol 11, 450-457;
Stemmer W.P., Crameri A., Ha K.D., Brennan T.M., Heyneker H.L.
(1995), Gene 164, 49-53; all incorporated herein by reference).
In short, the primers were mixed in an equimolar concentration
(12.5 µM of each primer) and assembled in 55 cycles of
denaturation (94 °C, 30 s), primer annealing (52 °C, 30 s) and
extension (72 °C, 30 s) using 0.1 unit/µl Pwo polymerase and
0.5 mM of each dNTP in the gene assembly buffer (10 mM Tris-HCl,
pH 8.8, 2.2 mM MgCl2, 50 mM KCl and 0.1 % Triton X-100). The
double gene was then amplified out of this reaction using Pwo
polymerase with the 5'-primer PTP116 that contains an EcoRI site
and the 3'-primer PTP111 that contains a SalI site. The PCR
product was cleaved with EcoRI and SalI and cloned into pRS316,
resulting in pRS316-C1/2 (cf. Sikorski R.S., and Hieter, P.
(1989), Genetics 122, 19-27, incorporated herein by reference).
The final construct contained the sequences for an N-terminal
FLAG tag, the polypeptide C1 followed by a five-residue linker,
an HpaI blunt end restriction site and a six-residue linker
followed by the polypeptide C2 with a C-terminal HA tag. C1 and
C2 are two peptides that associate into an antiparallel coiled
coil (cf. Oakley M.G., and Kim P.S. (1998), Biochemistry 37,
12603-12610; Oakley M.G., and Hollenbeck J.J. (2001), Curr Opin
Struct Biol 11, 450-457). The sequence of the PCUP1-promoter was
then cleaved out of the plasmid pAGTM2-Dha with BamHI and EcoRI
and positioned upstream of the C1/C2 cassette in pRS316-C1/2,
resulting in pRS316-C1/2CUP1.
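As a worked example of the assembly conditions, the following
Python sketch computes the net cycling time and the amount of Pwo
polymerase per reaction. The cycle parameters and the enzyme
concentration are those given above; the reaction volume of
50 µl is a hypothetical value, and ramping and hold times of the
thermocycler are not included.

```python
CYCLES = 55
STEP_SECONDS = {"denaturation": 30, "annealing": 30, "extension": 30}
PWO_UNITS_PER_UL = 0.1


def cycling_minutes(cycles: int, step_seconds: dict) -> float:
    """Net time spent in the programmed steps, without ramping."""
    return cycles * sum(step_seconds.values()) / 60.0


def pwo_units(reaction_volume_ul: float,
              units_per_ul: float = PWO_UNITS_PER_UL) -> float:
    """Enzyme units required for one reaction of the given volume."""
    return reaction_volume_ul * units_per_ul


print(f"net cycling time: {cycling_minutes(CYCLES, STEP_SECONDS):.1f} min")
print(f"Pwo for an assumed 50 µl reaction: {pwo_units(50.0):.1f} units")
```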

Cloning of pRS315CUP1 and of pRS316CUP1. The pRS315-derived
vector was constructed for an easy cloning of the different
Ntrp-SEC62 constructs, whereas the pRS316-derived vector was
constructed for an easy cloning of the different SEC63-Ctrp
constructs (cf. Sikorski R.S., and Hieter, P. (1989), Genetics
122, 19-27, incorporated herein by reference). The sequence of
the PCUP1-promoter of the plasmid pAGTM2-Dha was amplified by
PCR with the primers PTP181 and PTP182. The gene of ECFP was
amplified by PCR out of pLP-ECFP-C1 with the primers PTP183 and
PTP184. Both fragments were then combined by overlap extension
PCR using the 5'-primer PTP181 that contains a BamHI site and
the 3'-primer PTP184 that contains a SalI site, so that the
PCUP1-promoter is upstream of ECFP (cf. Ho S.N. et al. (1989),
Gene 77, 51-59; incorporated herein by reference). The partially
homologous primers PTP182 and PTP183 contain the sequence of the
restriction sites EcoRI, BglII and AvrII to allow a versatile
cloning of genes downstream of the PCUP1-promoter. The final
fragment consisting of PCUP1-promoter and ECFP was then cloned
into pRS315 or pRS316 with BamHI and SalI, resulting in
pRS315CUP1 or pRS316CUP1 (cf. Sikorski R.S., and Hieter, P.
(1989), Genetics 122, 19-27).

To generate split-protein sensors based on Trp1p (split-Trp) we
adapted an approach originally developed by Graf and Schachman
for creating random circular permutations of proteins (cf. Graf,
R., and Schachman, H.K. (1996), Proc Natl Acad Sci U S A 93,
11591-11596, incorporated herein by reference). Using PCR, the
TRP1 gene of Saccharomyces cerevisiae was first rearranged so
that it started with residue 63 and its former start codon was
fused to the stop codon via a linker sequence encoding a unique
AvrII restriction site. The N- and the C-terminal domains of
TRP1 were therefore amplified separately out of the plasmid
pYESTrp2 (Invitrogen) with the primers PTP113/115 and PTP112/114,
respectively, and recombined using overlap extension PCR with the
primers PTP112 and PTP115 (cf. Ho, S.N., et al. (1989), Gene 77,
51-59; incorporated herein by reference). This rearrangement was
performed to avoid unwanted isolation of the wild-type gene in
the subsequent selections. At the same time, a HindIII
restriction site was introduced via the PCR primers at the newly
generated N- and C-termini by introducing a silent mutation in
the gene at around amino acid 63. Since the direct digestion of
PCR products in former experiments yielded a product that did
not ligate efficiently, the rearranged gene was first inserted
into a high-copy plasmid (pAK400) and, after amplification of
the vector DNA, cut out with HindIII. The rearranged gene was
then incubated with T4 DNA ligase at 16 °C for 14 h at a DNA
concentration of 0.14 mg/ml, leading to the formation of circular
DNA as well as dimers and higher oligomers. After inactivation of
the ligase at 65 °C for 20 min and desalting of the solution
using a Microcon PCR column, the ligation products were incubated
with DNaseI (1.2 units/mg DNA) in 50 mM Tris-HCl, pH 7.5, 1 mM
MnCl2 at 25 °C for six minutes. The exact conditions for the
DNaseI reactions were determined immediately before the digestion
in small test reactions. The DNaseI reaction was stopped by
phenol extraction and ethanol precipitation. After incubation of
the digested DNA with T4 DNA ligase and T4 polymerase to repair
nicks and gaps and to flush the ends of the fragments, DNA
fragments corresponding to the size of the original gene were
isolated by gel electrophoresis. These fragments were ligated into
the pRS316-based yeast expression vector pRS316-C1/2CUP1 that was
cleaved with HpaI and dephosphorylated according to standard
protocols. In the resulting vector, the C-terminal half of TRP1
is fused to a gene encoding a FLAG tag, the polypeptide C1 and
a five-residue linker sequence and is expressed under the
control of the PCUP1-promoter. The N-terminal half of TRP1 is
fused to a gene encoding a six-residue linker sequence, the
polypeptide C2 and an HA tag. The sequences of the peptides C1 and
C2, including epitope tag and linker, are:

C1: MDYKDESGQALEKELAQNEWELQALEKELAQLEKELQAGSGSG,
(SEQ ID NO: 19)

C2: GGSGSGQALKKKLAQLKWKLQALKKKNAQLKKKLQAGSYPYDVPDYAAFL,
(SEQ ID NO: 20)
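
The relationship between the position of the DNaseI cut and the
resulting fragment pair can be illustrated with a short sketch. It
is a simplified model under stated assumptions and not the
procedure itself: residues are represented by their numbers, the
length of 224 residues is taken from the fragment 11-224 mentioned
in the text, exactly one cut at a codon boundary of a monomeric
circle is assumed, and the names rearranged_gene, linearize,
split_fragments and fragment_pair are hypothetical.

```python
LENGTH = 224       # residues in Trp1p, as implied by the fragment 11-224
START = 63         # first residue of the rearranged gene
LINKER = "AvrII"   # marker standing in for the AvrII linker


def rearranged_gene(length=LENGTH, start=START):
    """Residues start..length, the linker, then residues 1..start-1."""
    return list(range(start, length + 1)) + [LINKER] + list(range(1, start))


def linearize(circle, first_residue):
    """Open the circularized gene so that it begins with first_residue."""
    i = circle.index(first_residue)
    return circle[i:] + circle[:i]


def split_fragments(linear):
    """Return (Ctrp, Ntrp): the parts before and after the linker."""
    i = linear.index(LINKER)
    return linear[:i], linear[i + 1:]


def fragment_pair(cut_after, length=LENGTH):
    """Residue ranges of (Ntrp, Ctrp) for a single cut after cut_after."""
    if not 1 <= cut_after < length:
        raise ValueError("cut_after must lie between 1 and length - 1")
    circle = rearranged_gene(length)
    ctrp, ntrp = split_fragments(linearize(circle, cut_after + 1))
    return (ntrp[0], ntrp[-1]), (ctrp[0], ctrp[-1])


if __name__ == "__main__":
    for site in (20, 100, 200):   # arbitrary example positions
        n_range, c_range = fragment_pair(site)
        print(f"cut after {site}: Ntrp {n_range}, Ctrp {c_range}")
```

In this model fragment_pair(100) returns ((1, 100), (101, 224)),
i.e. an Ntrp of residues 1-100 and a Ctrp of residues 101-224, with
the Ctrp part preceding the linker and the Ntrp part following it
in the linearized gene.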

After transformation in XL1-Blue, resulting in a library with
about 3 X 10^ independent clones, the bacteria were scraped
from the plate, and the plasmids were isolated and linearized with
AvrII. To insert a terminator for the C-terminal fragment and a
promoter for the N-terminal fragment, a DNA fragment was
constructed by PCR consisting of the CYC1 terminator, a geneticin
resistance gene, the PGAL1-promoter and flanking regions of about
50 base pairs at the 5'- and 3'-ends homologous to the original N
and C termini of Trp1p. The CYC1 terminator was amplified out of
pYESTrp2 with the primers PTP107 and PTP120, whereas the cassette
containing the geneticin resistance gene and the PGAL1-promoter
was amplified out of pFA6a-GAL1 with the primers PTP108 and
PTP121. Both fragments were combined by overlap extension PCR
using the primers PTP120 and PTP121 (cf. Ho, S.N., et al. (1989),
Gene 77, 51-59; incorporated herein by reference). The linearized
vector (0.3 µg) and the PCR fragment (3 µg) were then
co-transformed in chemically competent EGY48 cells and plated on
plates lacking uracil but containing geneticin (500 µg/ml) to
select for insertion of the PCR fragment into the linearized
vector through homologous recombination (cf. Oldenburg et al.
(1997), Nucleic Acids Res 25, 451-452; incorporated herein by
reference). Chemically competent yeast cells were prepared as
described by standard protocols. The homologous recombination
also suppressed the predominant isolation of TRP1 genes that
were cut near the original N or C terminus. In the final
construct, the C-terminal fragment fused to C1 (C1-Ctrp) is under
the control of the inducible but leaky PCUP1-promoter and the
N-terminal fragment fused to C2 (Ntrp-C2) is under the stringent
control of the PGAL1-promoter. After 3 days of incubation at 30 °C,
approximately 1600 colonies were isolated and subsequently
replica-plated on plates lacking uracil and tryptophan but
containing geneticin (250 µg/ml), galactose (2 %) and CuSO4
(0.1 mM). After replica plating, 45 colonies were able to
complement tryptophan auxotrophy. Approximately half of those 45
colonies required the presence of galactose and CuSO4 to grow on
plates lacking tryptophan and twelve of these clones were then
analyzed by DNA sequencing (Figure 2). Five of the twelve analyzed
clones lead to the expression of Trp1p fragments in which both
fragments were fused in frame to the polypeptides C1 and C2
(marked with an asterisk in Figure 2). Seven of the twelve clones
were out of frame with C1 or C2. These frame shifts resulted in
the replacement of C2 in split-Trp^^^ and split-Trp^"'° with a
peptide of 66 residues possessing the sequence

DLDQVRHLRRSWRSLSGNCKLLRRRMPSLRRSSRLEVTHMFQITLHFYKSTSRGGPVPSFCSL
and in split-Trp^^°, split-Trp^^^, split-Trp^^^ and split-Trp^°^^
with a peptide of 10 residues possessing the sequence
(E/Q)RWIWIRSGT. It is assumed that Ntrp and Ctrp of these clones
associate spontaneously without the help of interacting
proteins. In split-Trp^^ and split-Trp^°^^ the mutation GlySCys was
introduced during the fragmentation procedure. However, the
influence of this mutation seems to be of minor importance as the
deletion of the first ten amino acids still allowed split-Trp"''^
to complement tryptophan auxotrophy (Figures 2 and 3).
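
The yield of the individual selection steps can be summarized with
a few lines of arithmetic. The sketch below only restates the
counts given above (1600 isolated colonies, 45 complementing
colonies, twelve sequenced clones, five of them in frame); the
variable names and the helper percent are hypothetical.

```python
isolated = 1600       # colonies isolated after homologous recombination
complementing = 45    # colonies growing without tryptophan
sequenced = 12        # clones analyzed by DNA sequencing
in_frame = 5          # clones with both fragments in frame with C1 and C2


def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


out_of_frame = sequenced - in_frame   # clones out of frame with C1 or C2

summary = {
    "complementing / isolated (%)": percent(complementing, isolated),
    "in frame / sequenced (%)": percent(in_frame, sequenced),
    "out of frame / sequenced (%)": percent(out_of_frame, sequenced),
}

if __name__ == "__main__":
    for label, value in summary.items():
        print(f"{label}: {value}")
```

With these counts roughly 2.8 % of the isolated colonies complement
tryptophan auxotrophy, and about 42 % of the sequenced clones carry
both fragments in frame.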



For split-Trp^^, split-Trp", split-Trp^^^, split-Trp^^^'' and
split-Trp^^ the sequence encoding Ntrp-C2 was deleted from the
plasmid using BglII and SalI and replaced with a PCR fragment
encoding only the corresponding Ntrp fragment. The resulting
constructs were then retransformed into EGY48 (Figure 3). To test
whether the trp1 complementation depends on the presence of both
Trp1p fragments we repeated the growth assays on plates lacking
tryptophan and galactose but containing glucose and copper,
thereby repressing the expression of Ntrp-C2. Of the five clones
tested, only split-Trp^^ conferred tryptophan prototrophy to the
trp1 yeast in the presence of glucose by itself, indicating that
its large C-terminal fragment spanning residues 11-224 already
possesses enzymatic activity. On galactose, split-Trp^^,
split-Trp^®'' and split-Trp^^ complemented tryptophan auxotrophy
at 30 °C and 23 °C, whereas split-Trp^^ and split-Trp^°^^
complemented tryptophan auxotrophy only at 23 °C (Figure 3).

The deletion of C2 abolished the capacity of the four clones
split-Trp^S, split-Trp", split-Trp^^'', split-Trp^^^^ to complement
tryptophan auxotrophy (Figure 3). This finding demonstrates that
the formation of a functional Trp1p from these fragments indeed
depends on the fusion to a pair of interacting polypeptides.
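
The reasoning behind these control experiments can be written down
as a small decision function. This is an illustrative
simplification, not part of the described method: the function
name, the argument names and the class labels are hypothetical, and
temperature dependence is ignored.

```python
def classify_clone(grows_with_c1_c2, grows_on_glucose, grows_without_c2):
    """Classify a split-Trp clone from three growth observations.

    grows_with_c1_c2 : growth without tryptophan on galactose and copper
    grows_on_glucose : growth without tryptophan when Ntrp-C2 is repressed
    grows_without_c2 : growth without tryptophan after deletion of C2
    """
    if not grows_with_c1_c2:
        return "no complementation"
    if grows_on_glucose:
        # The C-terminal fragment alone already provides activity.
        return "C-terminal fragment active on its own"
    if grows_without_c2:
        # Both fragments are needed, but not the interacting peptides.
        return "spontaneous assembly"
    # Growth is lost as soon as the interacting peptide is removed.
    return "interaction-dependent"


if __name__ == "__main__":
    print(classify_clone(True, False, False))
```

A clone that grows only when both fragments are expressed and both
peptides are present is reported as "interaction-dependent", which
corresponds to the four clones discussed above.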

Since the structure of Trp1p from S. cerevisiae has not yet been
solved, we aligned its sequence with the sequences of the
N-(5'-phosphoribosyl)-anthranilate isomerases from E. coli (ePRAI)
and Thermotoga maritima (tPRAI), and identified the fragmentation
sites in the known crystal structures of the homologous enzymes
(Figure 4). The fragmentation site of split-Trp"*^ lies in one of
the active site loops between β2 and α2, two residues away from
an arginine residue that interacts with the carboxyl group of
the substrate N-(5'-phosphoribosyl)-anthranilate. Although
combinatorial mutagenesis experiments have indicated that turn
sequences in general are highly mutable in (β/α)8-barrels, the
vicinity of this position to an active site residue would not have
made it an obvious candidate for a fragmentation site (cf.
Silverman, J.A., Balakrishnan, R., and Harbury, P.B. (2001),
Proc Natl Acad Sci U S A 98, 3092-3097). In split-Trp^®'' and
split-Trp^^ the fragmentation sites are located in α-helices α1
and α2 of the (β/α)8-barrel, respectively. This appears plausible
in hindsight with the mutability of α-helical residues in
combinatorial mutagenesis experiments on (β/α)8-barrels and with
earlier random circular permutation experiments of other folds in
which new termini were introduced into α-helices (cf. Silverman,
J.A., Balakrishnan, R., and Harbury, P.B. (2001), Proc Natl Acad
Sci U S A 98, 3092-3097; Graf, R., and Schachman, H.K. (1996),
Proc Natl Acad Sci U S A 93, 11591-11596). Furthermore, α-helix
α2 is extended by nine amino acids in Trp1p compared to ePRAI
and tPRAI, making it plausible that the introduction of a
fragmentation site could be tolerated without significantly
affecting the activity or the folding of the (β/α)8-barrel.
Particularly interesting is split-Trp^°^^, in which a stretch of
eight amino acids (205-212), including four highly conserved
residues, is deleted from Trp1p. This results in a very short Ctrp
of only twelve residues that is fused to C1, corresponding to
α-helix α8 in the structure of tPRAI and ePRAI. The eight deleted
amino acids form a loop in the vicinity of the active site,
directly after the short α-helix α8'. Helix α8' is believed to
participate in the binding of the phosphate group of the substrate
and is not present in the regular structures of other
(β/α)8-barrels (cf. Eder, J., and Kirschner, K. (1992),
Biochemistry 31, 3617-3625; Hennig, M., Sterner, R., Kirschner,
K., and Jansonius, J.N. (1997), Biochemistry 36, 6009-6016). While
split-Trp^°^^ complements tryptophan auxotrophy only at 23 °C,
indicating a decreased stability of the split enzyme, this finding
nevertheless questions the significance of this loop with its four
completely conserved residues in the function of
N-(5'-phosphoribosyl)-anthranilate isomerases. However, it is
unknown how much residual Trp1p activity is sufficient to
complement tryptophan auxotrophy in yeast and a more detailed
interpretation of this
finding will therefore require the kinetic characterization of
split-Trp^°^^ in in vitro assays. Eder and Kirschner have shown
that the N-terminal fragment 1-167 folds in the absence of its
C-terminal partner (cf. Eder, J., and Kirschner, K. (1992),
Biochemistry 31, 3617-3625). Furthermore, it has been proposed that
this N-terminal subdomain is an intermediate in the folding of
Trp1p (cf. Silverman, J.A., Balakrishnan, R., and Harbury, P.B.
(2001), Proc Natl Acad Sci U S A 98, 3092-3097; Kirschner, K.,
Szadkowski, H., Henschen, A., and Lottspeich, F. (1980), J Mol
Biol 143, 395-409; Jasanoff, A., Davis, B., and Fersht, A.R.
(1994), Biochemistry 33, 6350-6355; Silverman, J.A., and Harbury,
P.B. (2002), J Mol Biol 324, 1031-1040; Sanchez del Pino,
M.M., and Fersht, A.R. (1997), Biochemistry 36, 5560-5565). In
agreement with these studies all of the selected split-Trp pairs
that spontaneously assemble into a functional protein possess
relatively large N-terminal fragments, incorporating at least
the first five (β/α)-motifs. This observation suggests that a
spontaneous assembly of Trp1p fragments depends on the presence
of a folded N-terminal domain and that the location of the
fragmentation site reflects the folding pathway of the natural
protein. Shorter N-terminal fragments such as "^^Ntrp and ^^Ntrp
might not fold independently and the chances to spontaneously
reconstitute active protein from unfolded fragments without
induced proximity would be greatly diminished. Notably, most of the
isolated split-Trp pairs that reassemble spontaneously consist
of Trp1p fragments that overlap for at least 13 residues. This
overlap prevents us from exactly localizing the fragmentation site
from the sequence data (Figure 2). An exception is split-Trp"^
where, according to the structure of tPRAI, the fragmentation
site is located in a loop at the N-terminal side of the
(β/α)8-barrel.
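
Whether a fragment pair defines a unique fragmentation site can be
checked from the residue ranges of the two fragments. The sketch
below is illustrative only: the function names are hypothetical,
the length of 224 residues is taken from the fragment 11-224
mentioned above, and the overlapping pair in the example uses
made-up boundaries. The pair with Ntrp ending at residue 204 and
Ctrp starting at residue 213 follows from the deleted stretch
205-212 described in the text.

```python
TRP1P_LENGTH = 224  # taken from the fragment 11-224 mentioned in the text


def overlap(n_end, c_start):
    """Residues present in both Ntrp (1..n_end) and Ctrp (c_start..end)."""
    return max(0, n_end - c_start + 1)


def gap(n_end, c_start):
    """Residues missing between the end of Ntrp and the start of Ctrp."""
    return max(0, c_start - n_end - 1)


def ctrp_length(c_start, length=TRP1P_LENGTH):
    """Number of residues in a Ctrp fragment starting at c_start."""
    return length - c_start + 1


def unique_site(n_end, c_start):
    """Return n_end if the fragments are contiguous, otherwise None."""
    return n_end if c_start == n_end + 1 else None


if __name__ == "__main__":
    # Pair with the deletion 205-212 described in the text.
    print(gap(204, 213), ctrp_length(213), unique_site(204, 213))
    # Hypothetical overlapping pair: Ntrp 1-150, Ctrp 138-224.
    print(overlap(150, 138), unique_site(150, 138))
```

For the pair with the deleted stretch the sketch reports a gap of
eight residues and a Ctrp of twelve residues, in line with the
description above; for the hypothetical overlapping pair no unique
site is returned.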

Detection of membrane protein interactions using split-Trp sensors

An important application for new split-protein sensors will lie
in the detection and characterization of protein-protein
interactions occurring at the membranes of intracellular organelles
and the cell membranes. To test whether the split-Trp system
operates at the membrane, the interaction-dependent split-Trp
pairs were attached to the membrane proteins Sec62p and Sec63p
(Figure 4) (cf. Panzner, S., Dreier, L., Hartmann, E., Kostka,
S., and Rapoport, T.A. (1995), Cell 81, 561-570; Deshaies, R.J.,
and Schekman, R. (1989), J Cell Biol 109, 2653-2664; Wittke, S.,
Dunnwald, M., and Johnsson, N. (2000), Mol Biol Cell 11,
3859-3871). Sec62p and Sec63p directly bind to each other and are
part of the heptameric Sec-complex that is responsible for
translocating proteins posttranslationally across the membrane
of the endoplasmic reticulum (ER) (Figure 5A). Briefly, SEC62
was fused to the 3'-end of the N-terminal fragment of the four
split-Trp systems, allowing for the expression of ^%trp-Sec62p,
"Ntrp-Sec62p, ^®''Ntrp-Sec62p and ^°^*'Ntrp-Sec62p. SEC63 was fused
to the 5'-end of the corresponding C-terminal fragments, allowing
for the expression of Sec63p-^^Ctrp, Sec63p-"Ctrp, Sec63p-^®''Ctrp
and Sec63p-^°^^Ctrp.

To monitor the interaction between Sec62p and Sec63p, trp1 yeast
strains expressing pairs of matching Ntrp-Sec62p and Sec63p-Ctrp
fusion proteins were spotted on selective plates lacking
tryptophan (Figure 5). Strains co-expressing
^^Ntrp-Sec62p/Sec63p-^''Ctrp, ^^''Ntrp-Sec62p/Sec63p-^^^Ctrp and
204bNtrp-Sec62p/Sec63p-2°'^Ctrp were able to grow on plates
lacking tryptophan at 23 °C but not at 30 °C. Only small colonies
were detected after 7 days for ^^'^Ntrp-Sec62p/Sec63p-^^'^Ctrp and
after 10 days for 204bNtrp-Sec62p/Sec63p-^*^^^Ctrp, whereas
strains co-expressing ^^Ntrp-Sec62p/Sec63p-^^Ctrp grew
significantly faster. No growth at all was observed for strains
expressing "Ntrp-Sec62p/Sec63p-"Ctrp. To verify that the
observed complementation of tryptophan auxotrophy is a result of
the interaction between the Sec62p and Sec63p moieties of the
fusion proteins, we fused the C-terminal fragments of split-Trp^^
and split-Trp^^'' to the cytoplasmic site of Ste14p (Figure 4B).
Ste14p is a membrane protein of the ER that is known to interact
with neither Sec62p nor Sec63p (Figure 4B) (cf. Wittke, S.,
Lewke, N., Muller, S., and Johnsson, N. (1999), Mol Biol Cell
10, 2519-2530). No growth on plates lacking tryptophan was
observed when matching pairs of Sec62p and Ste14p fusion proteins
were co-expressed at 23 °C or 30 °C for 10 days (Figure 5). The
cellular amount of Ste14p-^^Ctrp is roughly 2-3 fold lower than
the amount of Sec63p-^^Ctrp as determined by western blotting
(data not shown). Since this relatively small effect cannot
account for the clear growth difference between the strains
expressing either ^^Ntrp-Sec62p/Sec63p-^^Ctrp or
^^Ntrp-Sec62p/Ste14p-^^Ctrp, we conclude that the
^%trp-Sec62p/Sec63p-^^Ctrp interaction signal is specific.

In more detail, the gene of SEC62 was amplified by PCR from
yeast EGY48 genomic DNA and combined by overlap extension PCR
with the N-terminal fragments of split-Trp^^, split-Trp^^,
split-Trp^'^ and split-Trp^^^^, yielding '%trp-SEC62,
^^l^trp-SEC62, ^^Xrp-SEC62 and ^°^^Ntrp-SEC62. At the same time,
a 6 x His tag was introduced at the 5'-end of Ntrp. The Ntrp genes
and SEC62 are connected by a sequence coding for a six-residue
linker (GGSGSG). The four Ntrp-SEC62 PCR products were isolated by
gel electrophoresis and ligated in a pRS315-derived expression
vector (LEU2) (pRS315CUP1) under the control of the
PCUP1-promoter. To this end, the vector was cleaved with BglII and
SalI and the ECFP gene was replaced by the corresponding Ntrp
construct.

The genes of SEC63 and STE14 were amplified by PCR from yeast
EGY48 genomic DNA and combined by overlap extension PCR with the
C-terminal fragments of split-Trp^^, split-Trp", split-Trp^" and
split-Trp^°^^. At the same time, a 6 x His tag was introduced at
the 3'-end of Ctrp, yielding SEC63-^^Ctrp-His, SEC63-"Ctrp-His,
SEC63-^^'^Ctrp-His, SEC63-^^^^'Ctrp-His, STE14-^'^Ctrp-His and
STE14-^^^Ctrp-His. SEC63 and the Ctrp-His genes are connected by a
sequence coding for a six-residue linker (GGSGSG). The different
SEC63-Ctrp-His and STE14-Ctrp-His PCR products were isolated by
gel electrophoresis and ligated into a pRS316-derived vector
(URA3) (pRS316CUP1, vide supra) under the control of the
PCUP1-promoter. To replace the 6 x His tag by the more sensitive
HA tag the genes of the different SEC63-Ctrp-His and
STE14-Ctrp-His constructs were amplified by PCR with a 3'-primer
that contains an HA tag and cloned into pRS316CUP1. All SEC63 and
STE14 fusions contained an HA tag fused to the C terminus of
Trp1p. The vector was cleaved with BglII and SalI and the ECFP
gene was replaced with the corresponding SEC63-Ctrp and STE14-Ctrp
constructs. All constructs were verified by DNA sequencing.

Expression of Ntrp-Sec62p fusion proteins. Expression and
functionality of the Ntrp-Sec62p fusion proteins was confirmed by
complementation of the temperature-sensitive yeast strain RSY529
(MATa his4 leu2-3,112 ura3-52 sec62-1) (cf. Rothblatt, J.A., et
al. (1989), J Cell Biol 109, 2641-2652). RSY529 contains an
endogenous temperature-sensitive variant of Sec62p. A colony of
RSY529 cells transformed with either pRS315 or a pRS315-derived
vector expressing ^%trp-Sec62p, "Ntrp-Sec62p, ^®'^Ntrp-Sec62p or
^°^^Ntrp-Sec62p was resuspended in 1 ml water and 5 µl were spotted
on YC-L medium containing 0.1 mM CuSO4 to induce the expression
of the fusion proteins and incubated at 30 °C and 38 °C for 6 d to
control for the complementation of the temperature sensitivity
of RSY529.

Expression of Sec63p-Ctrp and Ste14p-Ctrp fusion proteins. The
expression of the different Sec63p-Ctrp and Ste14p-Ctrp fusion
proteins was verified by immunoblotting using antibodies against
the HA tag at the C terminus of Trp1p. To this end, an
overnight culture of yeast EGY48 cells containing one of the
Sec63p-Ctrp or Ste14p-Ctrp fusion proteins was diluted in 10 ml
selective medium YC-U to an OD600 of 0.8 and grown for 3 h at
30 °C and 220 rpm. Protein expression was induced by adding CuSO4
to a final concentration of 0.1 mM. After 3 h of expression at
30 °C and 220 rpm, the cell suspension (same volume at same OD
when different samples were compared) was centrifuged at 4300 rpm
for 10 minutes and the pellet resuspended in 150 µl yeast lysis
buffer (50 mM HEPES, pH 7.5, 150 mM NaCl, 5 mM EDTA, 1 % Triton
X-100) containing 1 % (v/v) protease inhibitor cocktail and 0.5 mM
PMSF. 200 µl glass beads were added and the solution was vortexed
at full speed for 3 x 30 s and cooled on ice in between the
vortexing steps. The glass beads and the cell debris were pelleted
by centrifugation for 30 s at 13000 rpm and the supernatant was
mixed with an appropriate volume of 5 x SDS sample buffer (50 %
glycerol, 7.5 % SDS, 250 mM Tris-HCl, pH 8.0, 0.5 % bromophenol
blue, 12.5 mM 2-mercaptoethanol). Proteins were denatured for
3 min at 95 °C. Aliquots were analysed by Western blotting (12 %
SDS-PAGE) as described by standard protocols. After blotting, the
nitrocellulose membrane was incubated with 3 % dry milk in TBST
(10 mM Tris-HCl, 150 mM NaCl, pH 7.9, 0.05 % Tween 20) to block
unspecific antibody binding. Expression of Sec63p-Ctrp or
Ste14p-Ctrp fusion constructs was checked by incubation of the
membrane with the primary anti-HA antibody 1:7500 in TBST. An
anti-mouse-HRP antibody conjugate was used 1:7500 in TBST as
secondary antibody. Detection was done on a Kodak Image Station
440CF using the NEN Renaissance kit, a luminol-based
chemiluminescence system.
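
The "appropriate volume" of 5 x SDS sample buffer follows from
simple dilution arithmetic. The sketch below assumes that the
buffer is meant to reach 1 x in the final mixture, which the
protocol does not state explicitly; the function name
stock_volume and the example sample volume of 100 µl are
hypothetical, while the stock concentrations are those given
above.

```python
def stock_volume(sample_volume, stock_factor=5.0):
    """Volume of an n-fold stock that brings the mixture to 1 x."""
    if stock_factor <= 1:
        raise ValueError("stock_factor must be greater than 1")
    return sample_volume / (stock_factor - 1)


# Composition of the 5 x SDS sample buffer as listed in the text.
STOCK_5X = {
    "glycerol (%)": 50.0,
    "SDS (%)": 7.5,
    "Tris-HCl (mM)": 250.0,
    "bromophenol blue (%)": 0.5,
    "2-mercaptoethanol (mM)": 12.5,
}

# Concentrations after dilution to 1 x.
FINAL_1X = {name: value / 5.0 for name, value in STOCK_5X.items()}

if __name__ == "__main__":
    supernatant_ul = 100.0   # hypothetical sample volume in microlitres
    print(f"add {stock_volume(supernatant_ul)} µl of 5 x buffer")
    for name, value in FINAL_1X.items():
        print(f"{name}: {value}")
```

Under this assumption one part of 5 x buffer is added to four parts
of supernatant, so that the sample is denatured in 10 % glycerol,
1.5 % SDS and 50 mM Tris-HCl.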
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The present data demonstrate that in particular split-Trp^^ is 
well suited for the detection of protein-protein interactions
between membrane proteins. Interestingly, yeast cells
co-expressing '^'^Ntrp-Sec62p and Sec63p-^^Ctrp require lower
growth temperatures for the complementation of tryptophan
auxotrophy than the cells expressing the corresponding C1 and C2
coiled-coil fusions. This effect might be due to a more favorable
orientation of the N- and C-terminal Trp1p fragments in the
antiparallel coiled coil than in the Sec62p/Sec63p complex.

In conclusion, we have used directed evolution to convert
N-(5'-phosphoribosyl)-anthranilate isomerase into a split-protein
sensor. In coupling the interaction of cytosolic and membrane
proteins to a simple growth assay, the split-Trp system possesses
all the necessary features to complement already existing
systems to measure and screen for new protein interactions. This
split-Trp approach may be used in identifying partners of
medically relevant targets, e.g. in three-hybrid assays and
protein/small molecule interaction assays. Furthermore, the
evolutionary approach introduced here is generally applicable to
other enzymes. By generating novel split-protein sensors that
are based on proteins functioning in the matrix of e.g. the
mitochondrion, the peroxisome or the lumen of the secretory
pathway, this evolutionary approach will help to overcome the lack
of techniques to measure protein interactions in the interior of
these organelles. Finally, the analysis of the different
split-Trp pairs that either spontaneously assemble into a
functional (β/α)8-barrel or need to be fused to interacting
proteins to yield folded protein supports the hypothesis that a
large N-terminal subdomain of Trp1p is an important intermediate
in the folding of the (β/α)8-barrel.

Further experimental details

For the various PCR and gene assembly reactions, if not already
noted explicitly above, the following primers and templates were
used.



Primers used for Ntrp-constructs (cf. attached sequence listing
for details):

construct         5'-primer                     3'-primer                 template(s)                             cloning sites
                  PTP193 (SEQ ID NO: 55)        PTP146 (SEQ ID NO: 65)    split-Trp'*^
                  tr 1 ir ± z? (SEQ ID NO: 59)  PTP146 (SEQ ID NO: 65)    split-Trp-'^
                  PTP193 (SEQ ID NO: 55)        PTP170 (SEQ ID NO: 43)    split-Trp^^
                  PTP193 (SEQ ID NO: 55)        PTP172 (SEQ ID NO: 45)    split-Trp^**^
                  PTP193 (SEQ ID NO: 55)        PTP174 (SEQ ID NO: 47)    split-Trp""^^
SEC62             PTP147 (SEQ ID NO: 66)        PTP188 (SEQ ID NO: 53)    EGY48 yeast genomic DNA
^^Ntrp-SEC62      PTP193 (SEQ ID NO: 55)        PTP188 (SEQ ID NO: 53)    ^^Ntrp/SEC62, overlap extension PCR     BglII/SalI
"Ntrp-SEC62       PTP193 (SEQ ID NO: 55)        PTP188 (SEQ ID NO: 53)    ^^Ntrp/SEC62, overlap extension PCR     BglII/SalI
^®^Ntrp-SEC62     PTP193 (SEQ ID NO: 55)        PTP188 (SEQ ID NO: 53)    ^"Ntrp/SEC62, overlap extension PCR     BglII/SalI
^°*''Ntrp-SEC62   PTP193 (SEQ ID NO: 55)        PTP188 (SEQ ID NO: 53)    ^"^''Ntrp/SEC62, overlap extension PCR  BglII/SalI


Primers used for Ctrp-constructs (cf. attached sequence listing
for details):

construct              5'-primer                 3'-primer                 template(s)                                cloning sites
"Ctrp-His              PTP155 (SEQ ID NO: 41)    PTP194 (SEQ ID NO: 56)    split-Trp^*
^^Ctrp-HA              PTP155 (SEQ ID NO: 41)    PTP198 (SEQ ID NO: 58)    split-Trp"
"Ctrp-His              PTP171 (SEQ ID NO: 44)    PTP194 (SEQ ID NO: 56)    split-Trp"
"^Ctrp-His             PTP173 (SEQ ID NO: 46)    PTP194 (SEQ ID NO: 56)    split-Trp^^^
"^^'Ctrp-His           PTP175 (SEQ ID NO: 48)    PTP194 (SEQ ID NO: 56)    assembly PCR with primers PTP175, PTP176,
                                                                           PTP179, PTP180, PTP191, PTP192
SEC63                  PTP189 (SEQ ID NO: 54)    PTP154 (SEQ ID NO: 40)    EGY48 yeast genomic DNA
STE14                  PTP195 (SEQ ID NO: 57)    PTP157 (SEQ ID NO: 42)    EGY48 yeast genomic DNA                    -
SEC63-'*'*Ctrp-His     PTP189 (SEQ ID NO: 54)    PTP194 (SEQ ID NO: 56)    SEC63/'^^Ctrp-His, overlap extension PCR   BglII/SalI
SEC63-^"Ctrp-His       PTP189 (SEQ ID NO: 54)    PTP194 (SEQ ID NO: 56)    overlap extension PCR                      BglII/SalI
SEC63--"Ctrp-His       PTP189 (SEQ ID NO: 54)    PTP194 (SEQ ID NO: 56)    Ctrp-His, overlap extension PCR            BglII/SalI
SEC63--'"'^^Ctrp-His   PTP189 (SEQ ID NO: 54)    PTP194 (SEQ ID NO: 56)    overlap extension PCR                      BglII/SalI
STE14-Ctrp-His         PTP189 (SEQ ID NO: 54)    PTP194 (SEQ ID NO: 56)    STE14/^^Ctrp-His, PCR                      BglII/SalI
STE14-"''Ctrp-His      PTP195 (SEQ ID NO: 57)    PTP194 (SEQ ID NO: 56)    STE14/^^^Ctrp-His, overlap extension PCR   BglII/SalI
SEC63-^^Ctrp           PTP189 (SEQ ID NO: 54)    PTP198 (SEQ ID NO: 58)    SEC63-^'*Ctrp-His                          BglII/SalI
SEC63-"Ctrp            PTP189 (SEQ ID NO: 54)    PTP198 (SEQ ID NO: 58)    SEC63-^^Ctrp-His                           BglII/SalI
SEC63-"^Ctrp           PTP189 (SEQ ID NO: 54)    PTP198 (SEQ ID NO: 58)    SEC63-"*'Ctrp-His                          BglII/SalI
SEC63-*~°^^Ctrp        PTP189 (SEQ ID NO: 54)    PTP198 (SEQ ID NO: 58)    SEC63-^^^^'Ctrp-His                        BglII/SalI
STE14-'*^Ctrp          PTP195 (SEQ ID NO: 57)    PTP198 (SEQ ID NO: 58)    STE14-^^Ctrp-His                           BglII/SalI
STE14-^^''Ctrp         PTP195 (SEQ ID NO: 57)    PTP198 (SEQ ID NO: 58)    STE14-^^''Ctrp-His                         BglII/SalI



Primers used for zipper construction (cf. attached sequence
listing for details):

SEQ ID NO: 22: PTP22
SEQ ID NO: 23: PTP23
SEQ ID NO: 24: PTP24
SEQ ID NO: 25: PTP28
SEQ ID NO: 26: PTP29
SEQ ID NO: 27: PTP100
SEQ ID NO: 28: PTP110
SEQ ID NO: 29: PTP111
SEQ ID NO: 34: PTP116
SEQ ID NO: 35: PTP117
SEQ ID NO: 36: PTP118
SEQ ID NO: 37: PTP119



Primers used for the copper promoter (cf. attached sequence
listing for details):

SEQ ID NO: 49: PTP181
SEQ ID NO: 50: PTP182
SEQ ID NO: 51: PTP183
SEQ ID NO: 52: PTP184



Primers used for circular permutation of Trp1p (cf. attached
sequence listing for details):

SEQ ID NO: 30: PTP112
SEQ ID NO: 31: PTP113
SEQ ID NO: 32: PTP114
SEQ ID NO: 33: PTP115



Primers used for homologous recombination (cf. attached sequence
listing for details):

SEQ ID NO: 63: PTP107
SEQ ID NO: 64: PTP108
SEQ ID NO: 38: PTP120
SEQ ID NO: 39: PTP121



